{ "cells": [ { "cell_type": "markdown", "id": "25c8380d", "metadata": {}, "source": [ "# Python API Example - Price Release Data Import and Storage in Dataframe\n", "\n", "This guide is designed to provide an example of how to access the Spark API:\n", "- The path to your client credentials is the only input needed to run this script (just before Section 2)\n", "- This script has been designed to display the raw outputs of requests from the API, and then shows you how to format those outputs to enable easy reading and analysis\n", "- This script can be copied and pasted by customers for quick use of the API\n", "- Once comfortable with the process, you can change the variables that are called to produce your own custom analysis products. (Section 2 onwards in this guide).\n", " \n", "__N.B. This guide is just for Price release data. If you're looking for other API data products (such as Freight routes or Netbacks), please refer to their according code example files.__ \n" ] }, { "cell_type": "markdown", "id": "f555e29c", "metadata": {}, "source": [ "### Have any questions?\n", "\n", "If you have any questions regarding our API, or need help accessing specific datasets, please contact us at:\n", "\n", "__data@sparkcommodities.com__\n", "\n", "or refer to our API website for more information about this endpoint: https://www.sparkcommodities.com/api/lng-freight/contracts.html" ] }, { "cell_type": "markdown", "id": "84f27f99", "metadata": {}, "source": [ "## 1. Importing Data\n", "\n", "Here we define the functions that allow us to retrieve the valid credentials to access the Spark API.\n", "\n", "This section can remain unchanged for most Spark API users." 
] }, { "cell_type": "code", "execution_count": null, "id": "6fe087d8", "metadata": {}, "outputs": [], "source": [ "import json\n", "import os\n", "import sys\n", "import pandas as pd\n", "import numpy as np\n", "from base64 import b64encode\n", "from pprint import pprint\n", "from urllib.parse import urljoin\n", "import datetime\n", "from io import StringIO\n", "\n", "\n", "try:\n", " from urllib import request, parse\n", " from urllib.error import HTTPError\n", "except ImportError:\n", " raise RuntimeError(\"Python 3 required\")\n", "\n", "\n", "API_BASE_URL = \"https://api.sparkcommodities.com\"\n", "\n", "\n", "def retrieve_credentials(file_path=None):\n", " \"\"\"\n", " Find credentials either by reading the client_credentials file or reading\n", " environment variables\n", " \"\"\"\n", " if file_path is None:\n", " client_id = os.getenv(\"SPARK_CLIENT_ID\")\n", " client_secret = os.getenv(\"SPARK_CLIENT_SECRET\")\n", " if not client_id or not client_secret:\n", " raise RuntimeError(\n", " \"SPARK_CLIENT_ID and SPARK_CLIENT_SECRET environment vars required\"\n", " )\n", " else:\n", " # Parse the file\n", " if not os.path.isfile(file_path):\n", " raise RuntimeError(\"The file {} doesn't exist\".format(file_path))\n", "\n", " with open(file_path) as fp:\n", " lines = [l.replace(\"\\n\", \"\") for l in fp.readlines()]\n", "\n", " if lines[0] in (\"clientId,clientSecret\", \"client_id,client_secret\"):\n", " client_id, client_secret = lines[1].split(\",\")\n", " else:\n", " print(\"First line read: '{}'\".format(lines[0]))\n", " raise RuntimeError(\n", " \"The specified file {} doesn't look like to be a Spark API client \"\n", " \"credentials file\".format(file_path)\n", " )\n", "\n", " print(\">>>> Found credentials!\")\n", " print(\n", " \">>>> Client_id={}, client_secret={}****\".format(client_id, client_secret[:5])\n", " )\n", "\n", " return client_id, client_secret\n", "\n", "\n", "def do_api_post_query(uri, body, headers):\n", " url = urljoin(API_BASE_URL, 
uri)\n", "\n", " data = json.dumps(body).encode(\"utf-8\")\n", "\n", " # HTTP POST request\n", " req = request.Request(url, data=data, headers=headers)\n", " try:\n", " response = request.urlopen(req)\n", " except HTTPError as e:\n", " print(\"HTTP Error: \", e.code)\n", " print(e.read())\n", " sys.exit(1)\n", "\n", " resp_content = response.read()\n", "\n", " # The server must return HTTP 201. Raise an error if this is not the case\n", " assert response.status == 201, resp_content\n", "\n", " # The server returned a JSON response\n", " content = json.loads(resp_content)\n", "\n", " return content\n", "\n", "\n", "def do_api_get_query(uri, access_token, format='json'):\n", " \"\"\"\n", " After receiving an Access Token, we can request information from the API.\n", " \"\"\"\n", " url = urljoin(API_BASE_URL, uri)\n", "\n", " if format == 'json':\n", " headers = {\n", " \"Authorization\": \"Bearer {}\".format(access_token),\n", " \"Accept\": \"application/json\",\n", " }\n", " elif format == 'csv':\n", " headers = {\n", " \"Authorization\": \"Bearer {}\".format(access_token),\n", " \"Accept\": \"text/csv\"\n", " }\n", " else:\n", " raise ValueError('The format parameter only takes `csv` or `json` as inputs')\n", "\n", " # HTTP GET request\n", " req = request.Request(url, headers=headers)\n", " try:\n", " response = request.urlopen(req)\n", " except HTTPError as e:\n", " print(\"HTTP Error: \", e.code)\n", " print(e.read())\n", " sys.exit(1)\n", "\n", " resp_content = response.read()\n", " #status = response.status\n", "\n", " # The server must return HTTP 200. Raise an error if this is not the case\n", " assert response.status == 200, resp_content\n", "\n", " # Storing response based on requested format\n", " if format == 'json':\n", " content = json.loads(resp_content)\n", " elif format == 'csv':\n", " content = resp_content\n", "\n", " return content\n", "\n", "\n", "def get_access_token(client_id, client_secret):\n", " \"\"\"\n", " Get a new access_token. 
Access tokens are what applications use to make\n", "    API requests. Access tokens must be kept confidential in storage.\n", "\n", "    # Procedure:\n", "\n", "    Do a POST query with `grantType` and `scopes` in the body. A basic authorization\n", "    HTTP header is required. The \"Basic\" HTTP authentication scheme is defined in\n", "    RFC 7617, which transmits credentials as `clientId:clientSecret` pairs, encoded\n", "    using base64.\n", "    \"\"\"\n", "\n", "    # Note: for the sake of this example, we choose to use the Python urllib from the\n", "    # standard lib. One should consider using https://requests.readthedocs.io/\n", "\n", "    payload = \"{}:{}\".format(client_id, client_secret).encode()\n", "    headers = {\n", "        \"Authorization\": \"Basic {}\".format(b64encode(payload).decode()),\n", "        \"Accept\": \"application/json\",\n", "        \"Content-Type\": \"application/json\",\n", "    }\n", "    body = {\n", "        \"grantType\": \"clientCredentials\",\n", "    }\n", "\n", "    content = do_api_post_query(uri=\"/oauth/token/\", body=body, headers=headers)\n", "\n", "    print(\n", "        \">>>> Successfully fetched an access token {}****, valid {} seconds.\".format(\n", "            content[\"accessToken\"][:5], content[\"expiresIn\"]\n", "        )\n", "    )\n", "\n", "    return content[\"accessToken\"]\n" ] }, { "cell_type": "markdown", "id": "fd3171a8", "metadata": {}, "source": [ "## N.B. Credentials\n", "\n", "Here we call the above functions, and input the file path to our credentials.\n", "\n", "N.B. You must have downloaded your client credentials CSV file before proceeding. Instructions for downloading your credentials can be found here:\n", "\n", "https://www.sparkcommodities.com/api/request/authentication.html\n", "\n", "\n", "The code then prints the available prices that are callable from the API, and their corresponding Python ticker names are displayed as a list at the bottom of the Output."
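, "\n", "\n", "If you'd rather not keep a credentials file on disk, the `retrieve_credentials` function defined above also reads environment variables when called with no file path. A minimal sketch, assuming `SPARK_CLIENT_ID` and `SPARK_CLIENT_SECRET` are already set in your shell:\n", "\n", "```python\n", "# Assumes: export SPARK_CLIENT_ID=... and export SPARK_CLIENT_SECRET=... beforehand\n", "client_id, client_secret = retrieve_credentials()\n", "access_token = get_access_token(client_id, client_secret)\n", "```"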
] }, { "cell_type": "code", "execution_count": null, "id": "fd7e89bf", "metadata": {}, "outputs": [], "source": [ "# Insert file path to your client credentials here\n", "client_id, client_secret = retrieve_credentials(file_path=\"/tmp/client_credentials.csv\")\n", "\n", "# Authenticate:\n", "access_token = get_access_token(client_id, client_secret)" ] }, { "cell_type": "markdown", "id": "3638e7a7", "metadata": {}, "source": [ "## 2. Listing all available contract tickers" ] }, { "cell_type": "code", "execution_count": null, "id": "ce36499b", "metadata": {}, "outputs": [], "source": [ "# Define function for listing contracts from API\n", "def list_contracts(access_token):\n", " \"\"\"\n", " Fetch available contracts. Return contract ticker symbols\n", "\n", " # Procedure:\n", "\n", " Do a GET query to /v1.0/contracts/ with a Bearer token authorization HTTP header.\n", " \"\"\"\n", " content = do_api_get_query(uri=\"/v1.0/contracts/\", access_token=access_token)\n", "\n", " print(\">>>> All the contracts you can fetch\")\n", " tickers = []\n", " for contract in content[\"data\"]:\n", " print(contract[\"fullName\"])\n", " tickers.append(contract[\"id\"])\n", "\n", " return tickers\n", "\n", "# Fetch all contracts:\n", "tickers = list_contracts(access_token)\n", "\n", "print(tickers)" ] }, { "cell_type": "markdown", "id": "fc9cf152", "metadata": {}, "source": [ "## 3. Latest Price Release\n", "\n", "Here we call the latest price release." 
] }, { "cell_type": "code", "execution_count": null, "id": "d026eb33", "metadata": {}, "outputs": [], "source": [ "## Defining the function\n", "\n", "\n", "def fetch_latest_price_releases(access_token, ticker):\n", " \"\"\"\n", " For a contract, fetch then display the latest price release\n", "\n", " # Procedure:\n", "\n", " Do GET queries to /v1.0/contracts/{contract_ticker_symbol}/price-releases/latest/\n", " with a Bearer token authorization HTTP header.\n", " \"\"\"\n", " content = do_api_get_query(\n", " uri=\"/v1.0/contracts/{}/price-releases/latest/\".format(ticker),\n", " access_token=access_token,\n", " )\n", "\n", " return content[\"data\"]\n", "\n", "\n", "## Calling that function and storing the output\n", "\n", "# Here we store the entire dataset called from the API\n", "\n", "my_dict = fetch_latest_price_releases(access_token, 'spark30s')" ] }, { "cell_type": "code", "execution_count": null, "id": "04e61ee1", "metadata": {}, "outputs": [], "source": [ "# Shows how the raw output is formatted\n", "print(my_dict)" ] }, { "cell_type": "markdown", "id": "a0e0e030", "metadata": {}, "source": [ "## 3. Historical Prices\n", "\n", "Here we perform a similar task, but with historical prices instead. This is done using the URL:\n", "\n", "__/v1.0/contracts/{contract_ticker_symbol}/price-releases/{limit}{offset}__\n", "\n", "First we define the function that imports the data from the Spark API.\n", "\n", "We then call that function, and define 2 parameters:\n", "- 'tickers': which ticker do you want to call.\n", " - We define the variable 'my_ticker' after the function definition, and set this to 'tickers[2]' which corresponds to Spark25s\n", " - Alter this variable to whatever price product you need.\n", "- 'limit': this allows you to control how many datapoints you want to call. 
We use 'limit=10', which means we call the last 10 datapoints (the Spark30S spot price for the last 10 business days).\n", "    - For __Premium__ Users, alter this limit to however many datapoints you need.\n", "    - For __Trial__ Users, the limit parameter must not exceed 14 datapoints, as historical data is limited to 2 weeks for this plan.\n", "\n", "\n", "We save the output as a local variable called 'my_dict_hist'." ] }, { "cell_type": "code", "execution_count": null, "id": "ff4d0dcc", "metadata": {}, "outputs": [], "source": [ "def fetch_historical_price_releases(access_token, ticker, limit=4, offset=None):\n", "    \"\"\"\n", "    For a selected contract, this endpoint returns all the Price Releases you can\n", "    access according to your current subscription, ordered by release date descending.\n", "\n", "    **Note**: Unlimited access to historical data and full forward curves is only\n", "    available to those with Premium access. Get in touch to find out more.\n", "\n", "    **Params**\n", "\n", "    limit: optional integer value to set an upper limit on the number of price\n", "    releases returned by the endpoint. Default here is 4.\n", "\n", "    offset: optional integer value to set from where to start returning data.\n", "    Default is 0.\n", "\n", "    # Procedure:\n", "\n", "    Do GET queries to /v1.0/contracts/{contract_ticker_symbol}/price-releases/\n", "    with a Bearer token authorization HTTP header.\n", "    \"\"\"\n", "    print(\">>>> Get price releases for {}\".format(ticker))\n", "\n", "    query_params = \"?limit={}\".format(limit)\n", "    if offset is not None:\n", "        query_params += \"&offset={}\".format(offset)\n", "\n", "    content = do_api_get_query(\n", "        uri=\"/v1.0/contracts/{}/price-releases/{}\".format(ticker, query_params),\n", "        access_token=access_token,\n", "    )\n", "\n", "    my_dict = content[\"data\"]\n", "\n", "    return my_dict" ] }, { "cell_type": "markdown", "id": "0049914a", "metadata": {}, "source": [ "### N.B. 
Plan Limits\n", "\n", "__Premium__ Plan users have __no__ limits on historical data.\n", "\n", "__Trial__ Plan users only have access to the latest 2 weeks worth of historical data. Therefore the limit parameter cannot exceed 14." ] }, { "cell_type": "code", "execution_count": null, "id": "4836aa94", "metadata": {}, "outputs": [], "source": [ "### Define which price product you want to retrieve\n", "ticker = 'spark30s'\n", "\n", "my_dict_hist = fetch_historical_price_releases(access_token, ticker, limit=10)" ] }, { "cell_type": "markdown", "id": "99be9416", "metadata": {}, "source": [ "### Formatting into a Pandas DataFrame\n", "\n", "The outputted data has several nested lists and dictionaries. If we are aware of what variables we want, we can externally store these values as lists and create a Pandas DataFrame.\n", "\n", "Within a new dictionary, we create empty lists for variables:\n", "- Release Dates\n", "- Start of Period\n", "- Ticker\n", "- Price in dollars/day\n", "- Price in dollars/MMBtu\n", "- The spread of the data used to calculate the Spot Price\n", " - Min\n", " - Max\n", "\n", "The dictionary is then transformed into a Pandas Dataframe for readability and ease of use. \n", "\n", "## N.B. \n", "This JSON structure is not consistent across all datasets, and so might need to be amended when calling other Spark contracts." 
] }, { "cell_type": "code", "execution_count": null, "id": "27782530", "metadata": {}, "outputs": [], "source": [ "# Defining the function for storing and formatting the data into a Pandas DataFrame\n", "\n", "def store_and_format(dict_hist):\n", " stored_data = {\n", " \"ticker\": [],\n", " \"Period Start\": [],\n", " \"USDperday\": [],\n", " \"USDperdayMax\": [],\n", " \"USDperdayMin\": [],\n", " \"USDperMMBtu\": [],\n", " \"Release Date\": []\n", " }\n", "\n", " for release in dict_hist:\n", " release_date = release[\"releaseDate\"]\n", " stored_data['ticker'].append(release[\"contractId\"])\n", "\n", " stored_data['Release Date'].append(release_date)\n", "\n", " data_points = release[\"data\"][0][\"dataPoints\"]\n", "\n", " for data_point in data_points:\n", " period_start_at = data_point[\"deliveryPeriod\"][\"startAt\"]\n", " stored_data['Period Start'].append(period_start_at)\n", "\n", " stored_data['USDperday'].append(data_point[\"derivedPrices\"][\"usdPerDay\"][\"spark\"])\n", " stored_data['USDperdayMin'].append(data_point[\"derivedPrices\"][\"usdPerDay\"][\"sparkMin\"])\n", " stored_data['USDperdayMax'].append(data_point[\"derivedPrices\"][\"usdPerDay\"][\"sparkMax\"])\n", "\n", " stored_data['USDperMMBtu'].append(data_point[\"derivedPrices\"][\"usdPerMMBtu\"][\"spark\"])\n", " \n", " historical_df = pd.DataFrame(stored_data)\n", " \n", " historical_df[\"USDperday\"] = pd.to_numeric(historical_df[\"USDperday\"])\n", " historical_df[\"USDperdayMax\"] = pd.to_numeric(historical_df[\"USDperdayMax\"])\n", " historical_df[\"USDperdayMin\"] = pd.to_numeric(historical_df[\"USDperdayMin\"])\n", "\n", " historical_df[\"USDperMMBtu\"] = pd.to_numeric(historical_df[\"USDperMMBtu\"])\n", " historical_df[\"Release Date\"] = pd.to_datetime(historical_df[\"Release Date\"])\n", " \n", " return historical_df" ] }, { "cell_type": "code", "execution_count": null, "id": "56aa19be", "metadata": {}, "outputs": [], "source": [ "# Running the function to store the values\n", 
"historical_df = store_and_format(my_dict_hist)\n", "historical_df.head()" ] }, { "cell_type": "markdown", "id": "838a25c1", "metadata": {}, "source": [ "# Analytics Gallery\n", "Want to gain market insights using our data?\n", "\n", "Take a look at our [Analytics Gallery](https://www.sparkcommodities.com/api/code-examples/analytics-examples.html) on the Spark API website, which includes:\n", "\n", "- __Freight Spot Price Seasonality Chart__ - Compare freight spot prices across several years, to understand how the current market compares to historical prices at equivalent periods (e.g. Dec22 vs Dec23 vs Dec24).\n", "\n", "Want to create meaningful charts using our data?\n", "\n", "View our Freight Spot Price Seasonality Chart [here](https://www.sparkcommodities.com/api/code-examples/analytics-examples.html). \n" ] } ], "metadata": { "kernelspec": { "display_name": "base", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.11.5" } }, "nbformat": 4, "nbformat_minor": 5 }