Systemathics Tutorial - Ganymede API and Python Pandas DataFrame

14 January 2024

How to use a Python Pandas DataFrame with the Ganymede API?

Introduction

The Ganymede API, developed by Systemathics, grants access to a wide range of market data, including static, daily, intraday, tick and corporate actions data, and more. It covers different asset classes such as equities, futures, forex, options and bonds.

The Ganymede API is designed with versatility in mind. Because it is built on gRPC, it supports multiple programming languages. Whether you prefer Python, C#, C++, F#, Rust or another language, the Ganymede API ensures a consistent and efficient user experience across diverse programming environments.

This tutorial will demonstrate how to harness the capabilities of Ganymede API alongside Python's Pandas DataFrame, allowing you to efficiently handle and analyze the retrieved market data.

Before we delve into the world of data manipulation and analysis using Pandas, let's start with the basics and explore the fundamental concept of a Pandas DataFrame.

What is a Pandas DataFrame?

A Pandas DataFrame is a two-dimensional, tabular data structure that is widely used for data manipulation and analysis in Python. It is a fundamental and powerful data structure provided by the Pandas library. The DataFrame is similar to a spreadsheet or SQL table, where data is organized in rows and columns.
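
As a small illustration of the rows-and-columns layout, the sketch below builds a DataFrame from a dictionary of columns. The tickers, column names and prices are made-up sample values, not Ganymede data.

```python
import pandas as pd

# Each dictionary key becomes a column; each list position becomes a row.
# The tickers and prices are invented sample values.
prices = pd.DataFrame(
    {
        "Ticker": ["AAA", "BBB", "CCC"],
        "Close": [10.5, 20.0, 7.25],
        "Volume": [1000, 2500, 400],
    }
)

print(prices.shape)             # (number of rows, number of columns)
print(list(prices.columns))     # column labels
print(prices["Close"].mean())   # a single column behaves like a Series
```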

Keep in mind that, while Pandas DataFrames are a popular and powerful tool for data manipulation and analysis in Python, there are several alternatives such as NumPy and Vaex. Each alternative has its own advantages and trade-offs in terms of scalability, memory efficiency, parallel processing capabilities and so on.

1 Pandas Time Series DataFrame

1.1 Installation

Make sure Pandas is installed. If not, install it using:

pip install pandas

1.2 Importing Libraries

Import the required libraries. The examples below need pandas and numpy.


import pandas as pd
import numpy as np

1.3 Creating a Time Series DataFrame

For the purpose of this tutorial, let's create a time series DataFrame using a date range and sample data.


# Creating a time series DataFrame
date_range = pd.date_range(start='2023-01-01', end='2023-01-31', freq='D')
time_series_data = np.random.randn(len(date_range), 2)
df = pd.DataFrame(time_series_data, columns=['Value1', 'Value2'], index=date_range)

# display the data
print(df)

output


              Value1    Value2
2023-01-01  0.136937 -0.318777
2023-01-02  0.841878  0.141645
2023-01-03  0.630067  0.358567
2023-01-04  0.387476 -0.164671
2023-01-05  1.220184 -0.224055
2023-01-06 -0.759427  0.605909
2023-01-07 -1.786168 -0.951223
2023-01-08  0.472799  0.217767
2023-01-09 -1.291120  0.807408
2023-01-10 -0.138154 -1.084273
2023-01-11  1.626802  0.445217
2023-01-12 -0.769425 -2.207525
2023-01-13  2.639641 -1.064311
2023-01-14  1.741054  0.834897
2023-01-15  0.772527  0.033278
2023-01-16 -0.349433 -0.891260
2023-01-17  0.158650  1.862411
2023-01-18 -0.581017 -0.734269
2023-01-19 -0.126910  0.103833
2023-01-20 -0.456885  0.575853
2023-01-21  1.523911  1.533125
2023-01-22 -1.166604 -1.428796
2023-01-23  1.094989 -1.420081
2023-01-24 -0.887597 -2.376197
2023-01-25 -0.813943  1.363824
2023-01-26  1.651343  1.637309
2023-01-27 -0.002175 -2.157734
2023-01-28  2.581581 -1.007572
2023-01-29 -0.636318 -0.069670
2023-01-30 -0.613677  1.337950
2023-01-31 -0.347668  0.356858

1.4 Accessing Time Series Data

Accessing data using Python Pandas DataFrame involves various methods to select, filter, and manipulate the data within the DataFrame.

Accessing by Index:


# Accessing data for a specific date
selected_date = df.loc['2023-01-05']
print(selected_date)

output


Value1    1.220184
Value2   -0.224055
Name: 2023-01-05 00:00:00, dtype: float64

Accessing a Date Range:


# Accessing data for a date range
date_range_data = df.loc['2023-01-03':'2023-01-08']
print(date_range_data)

output


              Value1    Value2
2023-01-03  0.630067  0.358567
2023-01-04  0.387476 -0.164671
2023-01-05  1.220184 -0.224055
2023-01-06 -0.759427  0.605909
2023-01-07 -1.786168 -0.951223
2023-01-08  0.472799  0.217767
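
Beyond label-based lookups, the same DataFrame supports column selection, boolean filtering and resampling. The sketch below rebuilds a comparable random DataFrame so that it runs on its own; the fixed seed is an addition for reproducibility, so the values differ from those printed earlier.

```python
import numpy as np
import pandas as pd

# Rebuild a sample daily time series; the seed is only there for reproducibility
rng = np.random.default_rng(seed=0)
date_range = pd.date_range(start="2023-01-01", end="2023-01-31", freq="D")
df = pd.DataFrame(rng.standard_normal((len(date_range), 2)),
                  columns=["Value1", "Value2"], index=date_range)

# Select a single column (returns a Series)
value1 = df["Value1"]

# Keep only the rows where Value1 is positive (boolean filtering)
positive = df[df["Value1"] > 0]

# Aggregate the daily values into weekly means (weeks ending on Sunday)
weekly_mean = df.resample("W").mean()

print(positive.head())
print(weekly_mean)
```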

For more details, see the official documentation, which provides in-depth information about Pandas, including DataFrame, Series and other functionality: https://pandas.pydata.org/docs/

2 Ganymede API

Now that we have established a foundation with Pandas DataFrame, let's explore the next level of data retrieval by integrating the power of Ganymede API.

2.1 Install packages

Make sure you have the necessary libraries installed:

pip install "systemathics.apis==2.37.*" --upgrade

2.2 Import the required libraries

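The rest of this section uses the aliases imported below. The module paths are assumed from the package layout (the daily_bars_pb2 path matches the response type shown in section 2.5); verify them against your installed systemathics.apis version.

# assumed module paths; verify against the installed package
import grpc
import systemathics.apis.type.shared.v1.identifier_pb2 as identifier
import systemathics.apis.services.daily.v1.daily_bars_pb2 as daily_bars
import systemathics.apis.services.daily.v1.daily_bars_pb2_grpc as daily_bars_service
import systemathics.apis.helpers.token_helpers as token_helpers
import systemathics.apis.helpers.channel_helpers as channel_helpers

The systemathics.apis package brings in, in particular: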

token_helpers, which is used to get access tokens (the Ganymede gRPC APIs are authenticated)

channel_helpers, which is used to get secure bidirectional communication channels to the Ganymede gRPC API endpoints

daily_bars services, which retrieve historical daily data (open, high, low and close)

2.3 Handling Authentication

Get an access token

token = token_helpers.get_token()

2.4 Retrieve data

Create the request

request = daily_bars.DailyBarsRequest(identifier = identifier.Identifier(exchange = 'XNAS', ticker = 'AAPL'))

For details on the request and its parameters, see the official documentation: https://ganymede.cloud/docs/index.html#systemathics/apis/services/daily/v1/dailybarsrequest

Get a gRPC channel, create a stub (i.e. a client), send the request through the channel and collect the response:


try:
    # open a gRPC channel
    with channel_helpers.get_grpc_channel() as channel:  
        
        # create the service stub
        service = daily_bars_service.DailyBarsServiceStub(channel)
        
        # send the request through the channel and receive the response
        response = service.DailyBars(request = request, metadata = [('authorization', token)])
        
    print("Total bars retrieved: ",len(response.data))
except grpc.RpcError as e:
    # print the gRPC status code and the error details
    print(e.code().name)
    print(e.details())

output

Total bars retrieved:  9883

2.5 Analyse the response

To gain insights into the response, let's showcase its type and preview the initial items. For a comprehensive understanding of the request and response structure, refer to the documentation available at: https://ganymede.cloud/docs/index.html#systemathics/apis/services/daily/v1/dailybarsresponse


# display the type of the response
type(response)

output

systemathics.apis.services.daily.v1.daily_bars_pb2.DailyBarsResponse

The 'data' field of the response holds the individual daily bars, which are the core items of the dataset.

type(response.data)

output

google._upb._message.RepeatedCompositeContainer

Next, let's access and display the first three items of the 'data' field.


data_range = response.data[0:3]
type(data_range)

output

list

As shown, slicing the 'data' field returns a plain Python list of daily bars. Basic Python knowledge is therefore enough to handle and process the retrieved information.

print(data_range)

output


[date {
  year: 1984
  month: 9
  day: 7
}
open: 0.1183
high: 0.12
low: 0.1172
close: 0.1183
volume: 667877760
score: 100
, date {
  year: 1984
  month: 9
  day: 10
}
open: 0.1183
high: 0.1188
low: 0.1155
close: 0.1177
volume: 525593120
score: 100
, date {
  year: 1984
  month: 9
  day: 11
}
open: 0.1188
high: 0.1222
low: 0.1188
close: 0.12
volume: 1219454848
score: 100
]
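
Because each bar exposes plain attributes (date, open, high, low, close, volume, score), ordinary Python is enough to inspect the list. The sketch below uses SimpleNamespace objects as hypothetical stand-ins for the bar messages, populated with the three bars printed above, so that it runs without a Ganymede connection. With a live response you would iterate over response.data instead.

```python
from datetime import date
from types import SimpleNamespace

def make_bar(y, m, d, o, h, l, c, v):
    """Hypothetical stand-in mimicking the fields of a daily bar message."""
    return SimpleNamespace(date=SimpleNamespace(year=y, month=m, day=d),
                           open=o, high=h, low=l, close=c, volume=v, score=100)

# The three bars shown in the output above
bars = [
    make_bar(1984, 9, 7, 0.1183, 0.12, 0.1172, 0.1183, 667877760),
    make_bar(1984, 9, 10, 0.1183, 0.1188, 0.1155, 0.1177, 525593120),
    make_bar(1984, 9, 11, 0.1188, 0.1222, 0.1188, 0.12, 1219454848),
]

# Find the bar with the highest close and rebuild its date
best = max(bars, key=lambda b: b.close)
best_day = date(best.date.year, best.date.month, best.date.day)

# Sum the traded volume over the selection
total_volume = sum(b.volume for b in bars)

print(best_day.isoformat(), best.close)
print(total_volume)
```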

2.6 Convert the response to DataFrame

Hereafter, we'll outline the process of converting the 'data' into a DataFrame.


# datetime is needed to turn each bar's date message into a Python datetime
from datetime import datetime

# Prepare the DataFrame content: one list per column
dates = [datetime(b.date.year, b.date.month, b.date.day) for b in response.data]
opens = [b.open for b in response.data]
highs = [b.high for b in response.data]
lows = [b.low for b in response.data]
closes = [b.close for b in response.data]
volumes = [b.volume for b in response.data]
scores = [b.score for b in response.data]

# Assemble the columns and use the dates as the index
d = {'Date': dates, 'Open': opens, 'High': highs, 'Low': lows, 'Close': closes, 'Volume': volumes, 'Score': scores}
df = pd.DataFrame(data=d)
df = df.set_index('Date')
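
The same conversion can be wrapped in a small helper that builds one record per bar. This is an alternative sketch, not part of the systemathics.apis package: bars_to_dataframe is a hypothetical name, and the SimpleNamespace objects only stand in for response.data so that the example runs offline.

```python
from datetime import datetime
from types import SimpleNamespace

import pandas as pd

def bars_to_dataframe(bars):
    """Convert an iterable of bar-like objects into a date-indexed DataFrame."""
    records = [
        {
            "Date": datetime(b.date.year, b.date.month, b.date.day),
            "Open": b.open, "High": b.high, "Low": b.low, "Close": b.close,
            "Volume": b.volume, "Score": b.score,
        }
        for b in bars
    ]
    # Passing the column list keeps the layout stable even when bars is empty
    columns = ["Date", "Open", "High", "Low", "Close", "Volume", "Score"]
    return pd.DataFrame(records, columns=columns).set_index("Date")

# Stand-in bars (hypothetical objects mimicking the message fields)
sample = [
    SimpleNamespace(date=SimpleNamespace(year=1984, month=9, day=7),
                    open=0.1183, high=0.12, low=0.1172, close=0.1183,
                    volume=667877760, score=100),
    SimpleNamespace(date=SimpleNamespace(year=1984, month=9, day=10),
                    open=0.1183, high=0.1188, low=0.1155, close=0.1177,
                    volume=525593120, score=100),
]

sample_df = bars_to_dataframe(sample)
print(sample_df)
```

With a live response, the call would be bars_to_dataframe(response.data).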

2.7 Visualize the data

print(df)

output


                Open      High       Low     Close        Volume  Score
Date                                                                   
1984-09-07    0.1183    0.1200    0.1172    0.1183  6.678778e+08  100.0
1984-09-10    0.1183    0.1188    0.1155    0.1177  5.255931e+08  100.0
1984-09-11    0.1188    0.1222    0.1188    0.1200  1.219455e+09  100.0
1984-09-12    0.1200    0.1205    0.1166    0.1166  1.069285e+09  100.0
1984-09-13    0.1228    0.1233    0.1228    0.1228  1.664229e+09  100.0
...              ...       ...       ...       ...           ...    ...
2023-11-17  190.2500  190.3800  188.5700  189.6900  5.094140e+07  100.0
2023-11-20  189.8900  191.9050  189.8800  191.4500  4.653860e+07  100.0
2023-11-21  191.4100  191.5200  189.7400  190.6400  3.813440e+07  100.0
2023-11-22  191.4900  192.9300  190.8250  191.3100  3.963000e+07  100.0
2023-11-24  190.8700  190.9000  189.2500  189.9700  2.404830e+07  100.0

[9883 rows x 6 columns]
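
With the bars in a date-indexed DataFrame, the usual Pandas tools apply. The sketch below recreates the Close column of the first five rows shown above and derives daily returns and a 3-day moving average. The column names Return and MA3 are chosen for illustration only.

```python
import pandas as pd

# First five closes from the output above, indexed by date
close = pd.DataFrame(
    {"Close": [0.1183, 0.1177, 0.1200, 0.1166, 0.1228]},
    index=pd.to_datetime(["1984-09-07", "1984-09-10", "1984-09-11",
                          "1984-09-12", "1984-09-13"]),
)
close.index.name = "Date"

# Day-over-day percentage change of the close (first value is NaN)
close["Return"] = close["Close"].pct_change()

# 3-day simple moving average (NaN until three observations are available)
close["MA3"] = close["Close"].rolling(window=3).mean()

# Date-range slicing works as in section 1.4
print(close.loc["1984-09-10":"1984-09-12"])
```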

Conclusion

By combining Ganymede API's capabilities with Python's Pandas DataFrame, you can streamline the process of retrieving, processing, and analyzing market data. This tutorial aims to provide a foundational understanding of leveraging Ganymede API alongside Pandas for efficient and insightful data analysis.

Should you have any further questions, encounter challenges, or wish to explore specific topics in more detail, we remain available to assist. Feel free to reach out for any additional clarification or guidance.

Happy coding, and may your data analysis endeavors be both insightful and rewarding!
