Tanker Voyages Parquet Export

The Tanker Voyages Parquet Export API provides access to daily snapshot data in Parquet format for the TANKER market segment (tanker, chemical, LNG, LPG, FSO, and OBO vessels). This API allows you to download ZIP archives containing Parquet files for a specific date, where each archive represents a snapshot of voyage data as it existed on that date.

Parquet is a columnar storage format optimized for analytics workloads, making it ideal for large-scale data processing and analysis. The API supports HTTP caching via ETag semantics, allowing efficient incremental updates.

Each archive is a daily snapshot of voyage data, not a historical record. The snapshot contains the state of all voyages as they existed on the specified date.

Key Features

📦 Bulk Data Export – Download complete daily snapshot datasets for a specific date in Parquet format, optimized for analytics and data processing.

🗜️ ZIP Archive Format – Data is delivered as a ZIP archive containing one or more Parquet files, making it easy to download and process.

🔄 HTTP Caching – Support for ETag-based caching allows clients to efficiently check if data has changed without re-downloading unchanged archives.

📅 Daily Snapshots – Retrieve daily snapshots of voyage data for any date. Each snapshot represents the state of all voyages as they existed on that specific date.

⚡ Efficient Processing – Parquet format enables fast columnar queries and efficient compression, reducing storage and transfer costs.

This API is designed for bulk data retrieval and analytics use cases. For real-time or filtered queries, consider using the Tanker Voyages API instead.

Important: Each archive is a daily snapshot, not a historical record. To track changes over time, you need to download and compare multiple daily snapshots.

S3 Redirection

The API returns an HTTP redirect (302/307) to a signed S3 URL where the Parquet archive is stored. Clients must follow redirects to download the actual file. Most HTTP clients (including curl and requests) follow redirects automatically, but ensure your client is configured to do so.

Make sure your HTTP client follows redirects. The initial API response will be a redirect to the S3 bucket, and you must follow it to download the archive.
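If you need to inspect the redirect rather than follow it (for example, to hand the signed S3 URL to a separate download tool), you can disable automatic redirect handling. A minimal sketch; the token and date are placeholders, and the actual HTTP call is shown in comments:

```python
REDIRECT_CODES = {302, 307}  # the redirect codes this API issues (see above)

def is_redirect(status: int) -> bool:
    """True if the response status is one of the API's redirect codes."""
    return status in REDIRECT_CODES

# Hypothetical usage: fetch only the signed S3 URL, without downloading the body
# import requests
# resp = requests.get(
#     "https://apihub.axsmarine.com/tanker/voyage/parquet/v1",
#     headers={"Authorization": "Bearer YOUR_API_TOKEN"},
#     params={"date": "2025-12-17"},
#     allow_redirects=False,
# )
# if is_redirect(resp.status_code):
#     print(resp.headers["Location"])  # the signed S3 URL
```

Note that signed S3 URLs are typically short-lived, so download promptly after obtaining one.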

Example Requests

Basic Request

CURL
curl -X GET "https://apihub.axsmarine.com/tanker/voyage/parquet/v1?date=2025-12-17" \
  -H "Authorization: Bearer YOUR_API_TOKEN" \
  -L \
  -o voyages_2025-12-17.zip

The -L flag makes curl follow the redirect to the S3 bucket. curl does not follow redirects by default, so without -L you would save the redirect response itself instead of the archive.

Request with ETag Caching

First request:

CURL
curl -X GET "https://apihub.axsmarine.com/tanker/voyage/parquet/v1?date=2025-12-17" \
  -H "Authorization: Bearer YOUR_API_TOKEN" \
  -L \
  -D headers.txt \
  -o voyages_2025-12-17.zip

Extract ETag from headers.txt (e.g., ETag: "abc123def456"), then use it in subsequent requests:

CURL
curl -X GET "https://apihub.axsmarine.com/tanker/voyage/parquet/v1?date=2025-12-17" \
  -H "Authorization: Bearer YOUR_API_TOKEN" \
  -H "If-None-Match: \"abc123def456\"" \
  -L \
  -v

If the archive hasn't changed, you'll receive a 304 Not Modified response with no body.

Python Example with ETag Caching

Python
import os

import requests

url = "https://apihub.axsmarine.com/tanker/voyage/parquet/v1"
headers = {"Authorization": "Bearer YOUR_API_TOKEN"}
params = {"date": "2025-12-17"}

# Check if we have a cached ETag
etag_file = "etag_2025-12-17.txt"
if os.path.exists(etag_file):
    with open(etag_file, "r") as f:
        headers["If-None-Match"] = f.read().strip()

# requests.get() follows redirects automatically by default
response = requests.get(url, headers=headers, params=params, allow_redirects=True)

if response.status_code == 304:
    print("Archive has not changed since last download")
elif response.status_code == 200:
    # Save the ETag for next time
    if "ETag" in response.headers:
        with open(etag_file, "w") as f:
            f.write(response.headers["ETag"])
    # Save the ZIP file
    with open("voyages_2025-12-17.zip", "wb") as f:
        f.write(response.content)
    print("Archive downloaded successfully")
elif response.status_code == 404:
    print("Archive not found for the specified date")
    print(response.json())
else:
    print(f"Error: {response.status_code}")
    print(response.json())

Usage Patterns

Daily Snapshot Sync

For daily synchronization of voyage snapshot data:

  1. Download Today's Snapshot – Download the daily snapshot archive for the target date
  2. Store ETag – Save the ETag from the response headers
  3. Periodic Checks – On subsequent days, use the stored ETag with If-None-Match header to check if the snapshot has been updated
  4. Re-download if Changed – If you receive a 200 response, the snapshot has changed and should be processed

Multiple Daily Snapshots Analysis

To analyze voyage data across multiple dates using daily snapshots:

  1. Date Range Loop – Iterate through dates
  2. Download Snapshots – Download each date's snapshot archive
  3. Process Parquet Files – Extract and process Parquet files using tools like Pandas, Apache Spark, or DuckDB
  4. Compare Snapshots – Compare snapshots across dates to track changes in voyage states over time
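Step 1 above is easy to get wrong with hand-rolled date arithmetic; a minimal sketch of the date-range loop, using only the standard library (the download call itself is left as a comment since it depends on your HTTP client):

```python
from datetime import date, timedelta

def date_range(start: str, end: str) -> list[str]:
    """All YYYY-MM-DD dates from start to end, inclusive."""
    d0 = date.fromisoformat(start)
    d1 = date.fromisoformat(end)
    return [(d0 + timedelta(days=i)).isoformat() for i in range((d1 - d0).days + 1)]

# Hypothetical loop: download each day's snapshot, then compare them with
# pandas, Apache Spark, or DuckDB once the Parquet files are extracted.
# for d in date_range("2025-12-15", "2025-12-17"):
#     download_snapshot(d)  # e.g. the curl/requests call shown above
```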

Batch Processing

For batch processing workflows with multiple daily snapshots:

  1. Download Multiple Snapshots – Download snapshot archives for multiple dates in parallel
  2. Extract Parquet Files – Extract Parquet files from ZIP archives
  3. Load into Analytics Engine – Load Parquet files into your analytics platform (e.g., Apache Spark, DuckDB, BigQuery)
  4. Snapshot Analysis – Analyze and compare snapshots to understand voyage state changes over time
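The parallel download in step 1 can be sketched with a thread pool; the archive naming is a local convention, not mandated by the API, and `fetch` stands in for whatever HTTP call you use:

```python
from concurrent.futures import ThreadPoolExecutor

def archive_name(d: str) -> str:
    """Local file name for one daily snapshot archive (naming convention only)."""
    return f"voyages_{d}.zip"

def download_all(dates, fetch, max_workers=4):
    """Fetch snapshot archives for several dates in parallel.

    `fetch` is called once per date and should perform the actual download;
    results are returned keyed by date.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return dict(zip(dates, pool.map(fetch, dates)))

# Hypothetical usage, with `fetch` wrapping the requests call shown earlier:
# results = download_all(["2025-12-16", "2025-12-17"], fetch=download_snapshot)
```

Keep `max_workers` modest to stay within any rate limits your API plan imposes.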

Parquet File Structure

The ZIP archive contains one or more Parquet files representing a daily snapshot. The exact structure may vary, but typically includes:

  • Voyage data with all fields from the Tanker Voyages API, representing the state of voyages on the specified date
  • Columnar format optimized for analytics queries
  • Compression for efficient storage

Daily Snapshots

Each archive represents a daily snapshot of voyage data. This means:

  • Snapshot Content: The archive contains the state of all voyages as they existed on the specified date
  • Not Historical Records: Each snapshot is independent and does not contain historical changes or updates
  • State at Date: Voyages are included with their state (status, location, etc.) as of the snapshot date
  • Time Series Analysis: To track changes over time, download and compare multiple daily snapshots
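The compare step can be sketched as a plain dict diff; here each snapshot is modelled as `{voyage_id: state}`, which is an assumption for illustration — with real data you would build these mappings from the Parquet files:

```python
def diff_snapshots(old: dict, new: dict):
    """Compare two daily snapshots keyed by voyage ID.

    Returns (added, removed, changed): voyage IDs that appeared, disappeared,
    or changed state between the two snapshot dates.
    """
    added = sorted(new.keys() - old.keys())
    removed = sorted(old.keys() - new.keys())
    changed = sorted(k for k in old.keys() & new.keys() if old[k] != new[k])
    return added, removed, changed
```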

Market Segment

This API automatically filters results to the TANKER market segment, which includes:

  • tanker – Crude oil tankers
  • chemoil – Chemical tankers
  • chemical – Chemical carriers
  • lpg – LPG carriers
  • lng – LNG carriers
  • fso – Floating Storage and Offloading vessels
  • obo – Oil/Bulk/Ore carriers

The Parquet files can be large. Ensure you have sufficient storage and bandwidth for downloading and processing the archives. Use ETag caching to avoid unnecessary re-downloads.

Header Parameters

If-None-Match (string)

Optional ETag value received from a previous download. If the archive has not changed, the server will respond with 304 Not Modified.

Authorization (string, required)

Bearer token used for authentication.

Query Parameters

date (string, required)

Date for which to retrieve the archive (e.g., 2025-12-17). Must be in format YYYY-MM-DD.

Pattern
^\d{4}-\d{2}-\d{2}$
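Client-side, you can pre-validate the parameter against the documented pattern before issuing a request. Note that the pattern checks shape only, not calendar validity, so a string like "2025-13-40" still passes:

```python
import re

DATE_PATTERN = re.compile(r"^\d{4}-\d{2}-\d{2}$")  # the documented pattern

def is_valid_date_param(value: str) -> bool:
    """True if `value` matches the required YYYY-MM-DD shape."""
    return bool(DATE_PATTERN.match(value))
```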

Response

200 (Object)
ZIP archive containing Parquet files.

Response Attributes

  • response (string)

304 (Object)
Not Modified - the archive has not changed since the ETag supplied in the `If-None-Match` header.

400 (Object)
Bad request - Query-parameter `date` is required and must be in format YYYY-MM-DD.

Response Attributes

  • @context (string)
  • @id (string)
  • @type (string)
  • title (string)
  • detail (string)
  • status (integer)
  • type (string)
  • description (string)

404 (Object)
Not Found - Archive file not found for date "2012-12-16".

Response Attributes

  • @context (string)
  • @id (string)
  • @type (string)
  • title (string)
  • detail (string)
  • status (integer)
  • type (string)
  • description (string)

500 (Object)
Internal Server Error

Response Attributes

  • @context (string)
  • @id (string)
  • @type (string)
  • title (string)
  • detail (string)
  • status (integer)
  • type (string)
  • description (string)
