Systemathics Tutorial - Ganymede API and Python Pandas DataFrame
14 January 2024
How to use a Python Pandas DataFrame with the Ganymede API
Introduction
The Ganymede API, developed by Systemathics, grants access to a wide range of market data, including static, daily, intraday, tick, and corporate actions data. It covers several asset classes, such as equities, futures, forex, options, and bonds.
Ganymede API is designed with versatility in mind. It supports multiple programming languages, facilitated by gRPC. Whether you prefer Python, C#, C++, F#, Rust or another language, Ganymede API ensures a consistent and efficient user experience across diverse programming environments.
This tutorial will demonstrate how to harness the capabilities of Ganymede API alongside Python's Pandas DataFrame, allowing you to efficiently handle and analyze the retrieved market data.
Before we delve into the world of data manipulation and analysis using Pandas, let's start with the basics and explore the fundamental concept of a Pandas DataFrame.
What is a Pandas DataFrame?
A Pandas DataFrame is a two-dimensional, tabular data structure that is widely used for data manipulation and analysis in Python. It is a fundamental and powerful data structure provided by the Pandas library. The DataFrame is similar to a spreadsheet or SQL table, where data is organized in rows and columns.
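To make the spreadsheet analogy concrete, here is a minimal sketch that builds a small DataFrame from a dictionary of columns. The tickers and values are hypothetical and serve only as an illustration.

```python
import pandas as pd

# Hypothetical sample values, for illustration only
prices = pd.DataFrame(
    {
        "Ticker": ["AAA", "BBB", "CCC"],
        "Close": [10.5, 20.0, 7.25],
        "Volume": [1000, 2500, 400],
    }
)

print(prices.shape)            # (number of rows, number of columns)
print(prices.dtypes)           # each column has its own data type
print(prices["Close"].mean())  # a column behaves like a Pandas Series
```

Each dictionary key becomes a column, and each column can hold a different data type, much like a table in a database.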
Keep in mind that while Pandas DataFrames are a popular and powerful tool for data manipulation and analysis in Python, there are alternatives such as NumPy and Vaex. Each has its own advantages and trade-offs in terms of scalability, memory efficiency, and parallel processing capabilities.
1 Pandas Time Series DataFrame
1.1 Installation
Make sure Pandas is installed. If not, install it using:
pip install pandas
1.2 Importing Libraries
Import the required libraries. In this section we need pandas and numpy:
import pandas as pd
import numpy as np
1.3 Creating a Time Series DataFrame
For the purpose of this tutorial, let's create a time series DataFrame using a date range and sample data.
# Creating a time series DataFrame
date_range = pd.date_range(start='2023-01-01', end='2023-01-31', freq='D')
time_series_data = np.random.randn(len(date_range), 2)
df = pd.DataFrame(time_series_data, columns=['Value1', 'Value2'], index=date_range)
# display the data
print(df)
output
Value1 Value2
2023-01-01 0.136937 -0.318777
2023-01-02 0.841878 0.141645
2023-01-03 0.630067 0.358567
2023-01-04 0.387476 -0.164671
2023-01-05 1.220184 -0.224055
2023-01-06 -0.759427 0.605909
2023-01-07 -1.786168 -0.951223
2023-01-08 0.472799 0.217767
2023-01-09 -1.291120 0.807408
2023-01-10 -0.138154 -1.084273
2023-01-11 1.626802 0.445217
2023-01-12 -0.769425 -2.207525
2023-01-13 2.639641 -1.064311
2023-01-14 1.741054 0.834897
2023-01-15 0.772527 0.033278
2023-01-16 -0.349433 -0.891260
2023-01-17 0.158650 1.862411
2023-01-18 -0.581017 -0.734269
2023-01-19 -0.126910 0.103833
2023-01-20 -0.456885 0.575853
2023-01-21 1.523911 1.533125
2023-01-22 -1.166604 -1.428796
2023-01-23 1.094989 -1.420081
2023-01-24 -0.887597 -2.376197
2023-01-25 -0.813943 1.363824
2023-01-26 1.651343 1.637309
2023-01-27 -0.002175 -2.157734
2023-01-28 2.581581 -1.007572
2023-01-29 -0.636318 -0.069670
2023-01-30 -0.613677 1.337950
2023-01-31 -0.347668 0.356858
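Because the sample values are random, your output will differ from the listing above. If you need reproducible numbers, one option is a seeded NumPy generator. The sketch below assumes an arbitrary seed of 42 and uses the name seeded_df, which is not part of the original tutorial.

```python
import numpy as np
import pandas as pd

# A seeded generator returns the same values on every run (the seed is arbitrary)
rng = np.random.default_rng(seed=42)

date_range = pd.date_range(start="2023-01-01", end="2023-01-31", freq="D")
seeded_df = pd.DataFrame(
    rng.standard_normal((len(date_range), 2)),
    columns=["Value1", "Value2"],
    index=date_range,
)

# Show the first five rows
print(seeded_df.head())
```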
1.4 Accessing Time Series Data
Accessing data using Python Pandas DataFrame involves various methods to select, filter, and manipulate the data within the DataFrame.
Accessing by Index:
# Accessing data for a specific date
selected_date = df.loc['2023-01-05']
print(selected_date)
output
Value1 1.220184
Value2 -0.224055
Name: 2023-01-05 00:00:00, dtype: float64
Accessing a Date Range:
# Accessing data for a date range
date_range_data = df.loc['2023-01-03':'2023-01-08']
print(date_range_data)
output
Value1 Value2
2023-01-03 0.630067 0.358567
2023-01-04 0.387476 -0.164671
2023-01-05 1.220184 -0.224055
2023-01-06 -0.759427 0.605909
2023-01-07 -1.786168 -0.951223
2023-01-08 0.472799 0.217767
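Selecting columns and filtering rows follow the same pattern. The sketch below rebuilds a comparable sample DataFrame so that it runs on its own; the variable names are illustrative only.

```python
import numpy as np
import pandas as pd

# Rebuild a sample time series so this snippet is self-contained
idx = pd.date_range(start="2023-01-01", end="2023-01-31", freq="D")
sample = pd.DataFrame(
    np.random.default_rng(0).standard_normal((len(idx), 2)),
    columns=["Value1", "Value2"],
    index=idx,
)

# Select a single column (returns a Series)
value1 = sample["Value1"]

# Keep only the rows where Value1 is positive (boolean mask)
positive = sample[sample["Value1"] > 0]

# Combine a date slice with a column label
first_week_value2 = sample.loc["2023-01-01":"2023-01-07", "Value2"]

# Filter on a property of the index: Monday=0 ... Friday=4
weekdays = sample[sample.index.dayofweek < 5]

print(positive.head())
print(first_week_value2)
print(len(weekdays))
```

Note that label-based slices with .loc include both endpoints, unlike ordinary Python slices.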
For more details, the official documentation provides in-depth information about Pandas, including DataFrame, Series, and other functionality: https://pandas.pydata.org/docs/
2 Ganymede API
Now that we have established a foundation with Pandas DataFrame, let's explore the next level of data retrieval by integrating the power of Ganymede API.
2.1 Install packages
Make sure you have the necessary libraries installed:
pip install systemathics.apis==2.37.* --upgrade
2.2 Import the required libraries
The systemathics.apis package below brings in particular:
2.3 Handling Authentication
Get an access token
token = token_helpers.get_token()
2.4 Retrieve data
Create the request
request = daily_bars.DailyBarsRequest(identifier = identifier.Identifier(exchange = 'XNAS', ticker = 'AAPL'))
For a full description of the request and its parameters, see the official documentation: https://ganymede.cloud/docs/index.html#systemathics/apis/services/daily/v1/dailybarsrequest
Get a gRPC channel, create a stub (i.e., a client), send the request through the channel, and collect the response:
try:
    # open a gRPC channel
    with channel_helpers.get_grpc_channel() as channel:
        # create the service stub
        service = daily_bars_service.DailyBarsServiceStub(channel)
        # send the request through the channel and receive the response
        response = service.DailyBars(request = request, metadata = [('authorization', token)])
        print("Total bars retrieved: ", len(response.data))
except grpc.RpcError as e:
    # report the gRPC status code and the error details
    print(e.code().name)
    print(e.details())
output
Total bars retrieved: 9883
2.5 Analyse the response
To gain insights into the response, let's showcase its type and preview the initial items. For a comprehensive understanding of the request and response structure, refer to the documentation available at: https://ganymede.cloud/docs/index.html#systemathics/apis/services/daily/v1/dailybarsresponse
# display the type of the response
type(response)
output
systemathics.apis.services.daily.v1.daily_bars_pb2.DailyBarsResponse
The 'data' variable encapsulates individual daily bars, forming the core items within the dataset.
type(response.data)
output
google._upb._message.RepeatedCompositeContainer
Hereafter, we'll demonstrate accessing and presenting the initial three items within the data array.
data_range = response.data[0:3]
type(data_range)
output
list
As shown, slicing response.data yields a plain Python list of daily bars. Basic Python knowledge is therefore enough to handle and process the retrieved data.
print(data_range)
output
[date {
year: 1984
month: 9
day: 7
}
open: 0.1183
high: 0.12
low: 0.1172
close: 0.1183
volume: 667877760
score: 100
, date {
year: 1984
month: 9
day: 10
}
open: 0.1183
high: 0.1188
low: 0.1155
close: 0.1177
volume: 525593120
score: 100
, date {
year: 1984
month: 9
day: 11
}
open: 0.1188
high: 0.1222
low: 0.1188
close: 0.12
volume: 1219454848
score: 100
]
2.6 Convert the response to DataFrame
Hereafter, we'll outline the process of converting the 'data' into a DataFrame.
from datetime import datetime

# Prepare the DataFrame content: one list per column
dates = [datetime(b.date.year, b.date.month, b.date.day) for b in response.data]
opens = [b.open for b in response.data]
highs = [b.high for b in response.data]
lows = [b.low for b in response.data]
closes = [b.close for b in response.data]
volumes = [b.volume for b in response.data]
scores = [b.score for b in response.data]
d = {'Date': dates, 'Open': opens, 'High': highs, 'Low': lows, 'Close': closes, 'Volume': volumes, 'Score': scores}
# Build the DataFrame and use the dates as its index
df = pd.DataFrame(data=d)
df = df.set_index('Date')
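The same conversion can be done in a single pass over the bars. The sketch below is one possible variant, not part of the Ganymede API: bars_to_dataframe is a hypothetical helper, and SimpleNamespace objects stand in for the items of response.data so that the snippet runs without a connection. The stand-in values are copied from the three bars printed in section 2.5.

```python
from datetime import datetime
from types import SimpleNamespace

import pandas as pd


def bars_to_dataframe(bars):
    """Hypothetical helper: build a date-indexed DataFrame in one pass over the bars."""
    records = [
        {
            "Date": datetime(b.date.year, b.date.month, b.date.day),
            "Open": b.open,
            "High": b.high,
            "Low": b.low,
            "Close": b.close,
            "Volume": b.volume,
            "Score": b.score,
        }
        for b in bars
    ]
    return pd.DataFrame(records).set_index("Date")


def make_bar(year, month, day, open_, high, low, close, volume, score):
    """Stand-in for a daily bar; real bars come from response.data."""
    return SimpleNamespace(
        date=SimpleNamespace(year=year, month=month, day=day),
        open=open_, high=high, low=low, close=close, volume=volume, score=score,
    )


sample_bars = [
    make_bar(1984, 9, 7, 0.1183, 0.12, 0.1172, 0.1183, 667877760, 100),
    make_bar(1984, 9, 10, 0.1183, 0.1188, 0.1155, 0.1177, 525593120, 100),
    make_bar(1984, 9, 11, 0.1188, 0.1222, 0.1188, 0.12, 1219454848, 100),
]

frame = bars_to_dataframe(sample_bars)
print(frame)
```

With a real response, the call would be bars_to_dataframe(response.data), assuming each bar exposes the fields shown in section 2.5. The helper as written expects at least one bar.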
2.7 Visualize the data
print(df)
output
Open High Low Close Volume Score
Date
1984-09-07 0.1183 0.1200 0.1172 0.1183 6.678778e+08 100.0
1984-09-10 0.1183 0.1188 0.1155 0.1177 5.255931e+08 100.0
1984-09-11 0.1188 0.1222 0.1188 0.1200 1.219455e+09 100.0
1984-09-12 0.1200 0.1205 0.1166 0.1166 1.069285e+09 100.0
1984-09-13 0.1228 0.1233 0.1228 0.1228 1.664229e+09 100.0
... ... ... ... ... ... ...
2023-11-17 190.2500 190.3800 188.5700 189.6900 5.094140e+07 100.0
2023-11-20 189.8900 191.9050 189.8800 191.4500 4.653860e+07 100.0
2023-11-21 191.4100 191.5200 189.7400 190.6400 3.813440e+07 100.0
2023-11-22 191.4900 192.9300 190.8250 191.3100 3.963000e+07 100.0
2023-11-24 190.8700 190.9000 189.2500 189.9700 2.404830e+07 100.0
[9883 rows x 6 columns]
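Once the bars are in a date-indexed DataFrame, common time series calculations take a line each. The sketch below uses synthetic prices so that it runs on its own; only the Close and Volume column names mirror the DataFrame above. The 20-day window and the Return, SMA20 and monthly_close names are arbitrary choices for illustration.

```python
import numpy as np
import pandas as pd

# Synthetic daily bars on business days (not real market data)
idx = pd.bdate_range("2023-01-02", periods=60, name="Date")
rng = np.random.default_rng(1)
close = 100 * np.exp(np.cumsum(rng.normal(0, 0.01, size=len(idx))))
bars = pd.DataFrame(
    {"Close": close, "Volume": rng.integers(1_000_000, 5_000_000, size=len(idx))},
    index=idx,
)

# Daily percentage change of the close
bars["Return"] = bars["Close"].pct_change()

# 20-day simple moving average (NaN until 20 observations are available)
bars["SMA20"] = bars["Close"].rolling(window=20).mean()

# Last close of each calendar month
monthly_close = bars["Close"].groupby(bars.index.to_period("M")).last()

print(bars.tail())
print(monthly_close)
```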
Conclusion
By combining Ganymede API's capabilities with Python's Pandas DataFrame, you can streamline the process of retrieving, processing, and analyzing market data. This tutorial aims to provide a foundational understanding of leveraging Ganymede API alongside Pandas for efficient and insightful data analysis.
Should you have any further questions, encounter challenges, or wish to explore specific topics in more detail, we remain available to assist. Feel free to reach out for any additional clarification or guidance.
Happy coding, and may your data analysis endeavors be both insightful and rewarding!