Guide to Simple Blockchain Analysis using the Dogechain.info API

Blockchain Panda
Mar 11, 2021

The aim of this article is to show how easy it is, with a little bit of python, to analyse the dogecoin blockchain and hopefully learn something in the process. The dogechain.info API allows for easy access to dogecoin blockchain data. In this post I will have a quick look at submitting requests and analysing the returned data.

Setup

As with all python data science endeavours, first we have to import all libraries we will be using.

import datetime
import io
import json
import pprint
from typing import Union

import matplotlib.pyplot as plt
import pandas as pd
import requests
import seaborn as sns

Request Block Information From Dogecoin

To send a request to dogechain.info I am using the python requests library. The get_block function below sends a simple https request to get information about an individual block, given only the block height. To find the current height we can use get_current_height, which is also just a single http request. Both functions return None if the request does not succeed.

def get_current_height() -> Union[int, None]:
    # Returns the current block count, or None if the request fails.
    url = "http://dogechain.info/chain/Dogecoin/q/getblockcount"
    with requests.get(url) as r:
        if r.ok:
            return int(r.content)
    return None

def get_block(height: int) -> Union[dict, None]:
    # Returns the parsed json for one block, or None if the request fails.
    url = f"https://dogechain.info/api/v1/block/{height}"
    with requests.get(url) as r:
        if r.ok:
            return json.loads(r.content)
    return None
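Because both functions return None on an unsuccessful response, a single failed request can leave a gap in the data. The sketch below is one possible way to retry such a call a few times. The fetch_with_retry helper, its parameters and the pause length are all illustrative assumptions, not part of the API or the original notebook, and it does not catch exceptions such as connection errors.

```python
import time
from typing import Callable, Optional


def fetch_with_retry(
    fetch: Callable[[], Optional[dict]],
    retries: int = 3,
    delay_seconds: float = 1.0,
) -> Optional[dict]:
    """Call `fetch` until it returns something other than None (hypothetical helper)."""
    for attempt in range(retries):
        result = fetch()
        if result is not None:
            return result
        if attempt < retries - 1:
            # Brief pause before trying again.
            time.sleep(delay_seconds)
    return None


# Intended usage with the function defined above (needs network access):
# block = fetch_with_retry(lambda: get_block(3641435))
```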

Using these functions we can get the current height and retrieve the data for a given block.

current_height = get_current_height()
print(f"Current dogecoin height: {current_height:,}")
pprint.pprint(get_block(current_height))

Current dogecoin height: 3,641,435
{'block': {'average_coin_age': '395.32024690',
'confirmations': 1,
'difficulty': 5357404.901646454,
'hash': '32bce595d10cbea0d7a897c38f9c0a8629df24524aead7478e799c9ea05646e2',
'height': 3641435,
'is_orphan': False,
'merkleroot': '5d07643de6e1784ed5a2ba1169a9b8761116bd12b8f8f59bb5d494c9fccb2083',
'next_block_hash': None,
'nonce': 0,
'num_txs': 14,
'previous_block_hash': '0182961c8c22b153f659936e7c34432031410d45a7e1266e07217a9fe6688eeb',
'time': 1615460006,
'txs': ['eae8fe99edf143b2564b750add70eca6d07381f38a75a1c226a4be1117cdd4a7',
'8f17d5e66025a6331310076d5d8f217b6e52a7aa12f7ea5b5735816fc2097b90',
'67922ff4f84a4cfabf526bf0009e2b06d67cf0427971e351e2e9969cfd1c4e29',
'5d6338526282f741b340fd59964101e9be3d9378bd97ad5ed46cc4c30e6d94b8',
'f4503911cb02335e423639fccf512b40cf3170a579826a2440bfe2432a92a95d',
'4c9009ecf1edb0f964a0f3619cb64c34513dfe4e0843f63168b6cbfb9181ddae',
'ab7b5c984ee8d7752c0098183b5e212aa006471ca153407aeea852dc2cc5267d',
'754c193e93b07b779b18b81066898ebfd1d39571b680f5413618cfe6a7dacfe7',
'61bd926707e7c89095b0e260871105fc9dd1d144bd0dc046a3831d7cf07a5b92',
'68948d6e0b139449a31276c61228a51ad750b221b87b2103df87a6f866be2d55',
'7d819208f26c04fb1d8b3cf2556c0b8b24e845534d4c68d9f326537ace1dfdb3',
'c7fd9604f4f2a276f05b6ed9adb8e1d421d37ae8370d3f47e850545dd7e06ccc',
'd9023bd35020d9978101bacd2173bfeed077bb215a939f620c659267269c55a4',
'31479f8638475c04abbce84871bf0ebaedfa0b33d04f3f3a00e1430896710b5a'],
'value_in': '17542518.49462023',
'value_out': '17552518.49462023',
'version': 6422788},
'success': 1}

There is a lot of data in each block. It is worth noting the data is nested in the json object under the block key.
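As a small illustration of that nesting, the sketch below unpacks a response shaped like the one printed above. The summarise_block helper is hypothetical, and the sample dictionary is a trimmed copy of a few fields from that output. The value fields arrive as strings, so they are parsed with Decimal here to avoid float rounding.

```python
import datetime
from decimal import Decimal

# A trimmed copy of the response printed above (only a few fields kept).
response = {
    "block": {
        "height": 3641435,
        "num_txs": 14,
        "time": 1615460006,
        "value_in": "17542518.49462023",
        "value_out": "17552518.49462023",
    },
    "success": 1,
}


def summarise_block(response: dict) -> dict:
    """Pull a few fields out of the nested 'block' key (hypothetical helper)."""
    block = response["block"]
    return {
        "height": block["height"],
        "num_txs": block["num_txs"],
        # The timestamp is in seconds since the Unix epoch.
        "time": datetime.datetime.fromtimestamp(
            block["time"], tz=datetime.timezone.utc
        ),
        # Coins leaving the block minus coins entering it.
        "net_new_coins": Decimal(block["value_out"]) - Decimal(block["value_in"]),
    }


summary = summarise_block(response)
print(summary["time"].isoformat())  # 2021-03-11T10:53:26+00:00
print(summary["net_new_coins"])     # 10000.00000000
```

The difference of 10,000 DOGE in this block is consistent with dogecoin's fixed block reward of 10,000 DOGE.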

Sampling Blocks

Dogecoin adds a block roughly every minute, so querying every block would take a long time. Instead I am going to sample by requesting every 10,000th block. To get all the relevant blocks I just loop over the block heights and store the results in a dictionary.

sample_every_n_blocks = int(1e4)
sample_block_heights = range(0, current_height, sample_every_n_blocks)
sample_blocks = {height: get_block(height) for height in sample_block_heights}
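To get a feel for what this sampling means, here is a back-of-the-envelope sketch. It assumes the height printed earlier in this article and an average block time of exactly one minute, which is only an approximation.

```python
import datetime

current_height = 3_641_435       # height printed earlier in this article
sample_every_n_blocks = 10_000
block_interval = datetime.timedelta(minutes=1)  # assumed average block time

sample_block_heights = range(0, current_height, sample_every_n_blocks)
n_requests = len(sample_block_heights)
time_between_samples = sample_every_n_blocks * block_interval

print(n_requests)            # 365
print(time_between_samples)  # 6 days, 22:40:00
```

So the loop above sends a few hundred requests rather than several million, and consecutive samples are roughly a week apart.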

Convert to Pandas

To further analyse the data I convert the json objects to a pandas dataframe. First I remove any empty responses. Then as I mentioned earlier the actual block data is nested into the json object under the key block. These block data are then passed to pandas as a list of dictionaries.

df = pd.DataFrame.from_records([block.get('block') for block in sample_blocks.values() if block])
print(list(df.columns))

['hash', 'height', 'previous_block_hash', 'next_block_hash', 'is_orphan', 'difficulty', 'time', 'confirmations', 'merkleroot', 'num_txs', 'value_in', 'value_out', 'version', 'average_coin_age', 'nonce', 'txs']

All the fields are present in the dataframe, but for now I am only going to focus on a few.

df = df[['height', 'num_txs', 'value_in', 'value_out', 'average_coin_age', 'time', 'difficulty']].copy()
df.tail()

Handling Time

The creation time of each block is in the time column as a Unix timestamp in seconds. This can easily be converted to a python datetime. Given the time between sampled blocks is around 7 days (10,000 blocks at roughly one per minute), I have just rounded to the nearest day to simplify further analysis.

# Assign the whole column so its dtype changes from integer to datetime.
df['time'] = pd.to_datetime(df['time'], unit='s').dt.round('1D')
df = df.set_index('time')
df.tail()
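To make the conversion concrete, here is a standard-library sketch of the same idea applied to the timestamp of the block printed earlier. The round_to_nearest_day helper is hypothetical and only approximates the pandas rounding above: it should agree except possibly for timestamps falling exactly on noon.

```python
import datetime


def round_to_nearest_day(timestamp: int) -> datetime.date:
    """Convert a Unix timestamp (seconds, UTC) to the nearest calendar date."""
    moment = datetime.datetime.fromtimestamp(timestamp, tz=datetime.timezone.utc)
    # Adding 12 hours before truncating rounds to the nearest midnight.
    return (moment + datetime.timedelta(hours=12)).date()


print(round_to_nearest_day(1615460006))  # 2021-03-11
```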

Price Data

As part of most blockchain analysis we probably want to include the value of doge against the US dollar. There are many APIs and websites that provide price data but for simplicity I use yahoo finance. By simply specifying the currency pair (in this case DOGE/USD) and the start and stop times you can get the open, high, low and close prices trivially.

Using the requests library again we can pull the prices. By using the min and max times of our blockchain data we can request the relevant timespan from yahoo as well. The yahoo finance data is returned as bytes and we can use io.BytesIO to push the data into pandas.

def get_doge_price_data(start_timestamp: int, end_timestamp: int) -> Union[bytes, None]:
    # Returns the raw CSV bytes from yahoo finance, or None if the request fails.
    url = f"https://query1.finance.yahoo.com/v7/finance/download/DOGE-USD?period1={start_timestamp}&period2={end_timestamp}&interval=1d&events=history"
    with requests.get(url) as r:
        if r.ok:
            return r.content
    return None

price_data = get_doge_price_data(int(df.index.min().timestamp()), int(df.index.max().timestamp()))
df_price = pd.read_csv(io.BytesIO(price_data), parse_dates=['Date'])
df_price.head()
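The period1 and period2 parameters in that URL are Unix timestamps in seconds. As a rough sketch of how the query string is put together, the hypothetical build_price_url helper below constructs the same URL from two calendar dates, assuming both are interpreted as midnight UTC. It only builds the string and does not send a request.

```python
import datetime
from urllib.parse import urlencode

BASE_URL = "https://query1.finance.yahoo.com/v7/finance/download/DOGE-USD"


def build_price_url(start: datetime.date, end: datetime.date) -> str:
    """Build the download URL for daily prices between two dates (midnight UTC)."""

    def to_timestamp(day: datetime.date) -> int:
        midnight = datetime.datetime.combine(
            day, datetime.time(), tzinfo=datetime.timezone.utc
        )
        return int(midnight.timestamp())

    query = urlencode({
        "period1": to_timestamp(start),
        "period2": to_timestamp(end),
        "interval": "1d",
        "events": "history",
    })
    return f"{BASE_URL}?{query}"


print(build_price_url(datetime.date(2021, 1, 1), datetime.date(2021, 3, 11)))
```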

Joining Price Data and Blockchain Data

We now have two dataframes: one of blockchain data and one of prices, but it would be better if we could combine them. Given they both contain a date column we can use this for the joining.

df = pd.merge(df, df_price[['Date','Close']], left_index=True, right_on='Date')
df.set_index('Date', inplace=True)
df.tail()
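One thing worth knowing about this join: pd.merge performs an inner join by default, so any sampled block whose date has no matching price row is dropped. The toy example below uses made-up numbers purely to illustrate that behaviour.

```python
import pandas as pd

# Made-up numbers, purely for illustration.
blocks = pd.DataFrame(
    {"height": [0, 10_000, 20_000]},
    index=pd.to_datetime(["2021-01-01", "2021-01-08", "2021-01-15"]),
)
prices = pd.DataFrame({
    "Date": pd.to_datetime(["2021-01-08", "2021-01-15", "2021-01-22"]),
    "Close": [0.01, 0.02, 0.03],
})

# Same call pattern as above: the left index is matched against the 'Date' column.
merged = pd.merge(blocks, prices[["Date", "Close"]], left_index=True, right_on="Date")
merged = merged.set_index("Date")
print(merged)
```

Only the two dates present in both frames survive, so the block at height 0 disappears from the result.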

Correlations

Now that we have a dataframe with all the data of interest, let's have a quick look at correlations between columns. For this I am using pandas corr() functionality and the seaborn heatmap. Before the correlation a few of the columns need to be converted to a numeric data type. This goes back to the format of the json returned by dogechain.info, as some of the numeric values are returned as strings.

df['value_in'] = pd.to_numeric(df['value_in'])
df['value_out'] = pd.to_numeric(df['value_out'])
df['average_coin_age'] = pd.to_numeric(df['average_coin_age'])
fig, ax = plt.subplots(figsize=(12,6))
sns.heatmap(df.corr(), annot=True, ax=ax);

Unsurprisingly the value_in and value_out are perfectly correlated. Aside from this the strongest correlation found is between blockchain height and difficulty.
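One likely reason for the perfect correlation: in the block printed earlier, value_out exceeds value_in by exactly 10,000 DOGE. If that offset were the same for every sampled block (an assumption, checked here only for that one block), the two columns would differ by a constant, and a constant offset does not change the Pearson correlation. A minimal sketch with made-up values:

```python
import math
from typing import Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


value_in = [120.0, 5_300.5, 17_542_518.5, 880.25]  # made-up values
value_out = [v + 10_000 for v in value_in]          # constant offset

print(round(pearson(value_in, value_out), 6))  # 1.0
```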

Time Series

Let's have a look at how difficulty varies over time. Above we used height, but given the near-constant rate of block creation we can switch to time and in the process get a more interpretable feature.

Below we have made use of pandas and matplotlib to visualise the DOGE/USD price and the difficulty. To smooth out the spikes we have used the pandas rolling method to calculate a rolling mean. As doge spent most of its initial life at a very low price and has jumped significantly in the last couple of years, we have a very wide range of values to represent. To achieve this we use a log y-scale.

fig, (ax1, ax2) = plt.subplots(figsize=(12,12), nrows=2)

df['Close'].plot(ax=ax1, label='Close')
df['Close'].rolling(7).mean().plot(ax=ax1, label='rolling mean Close')
ax1.set_ylabel('Closing price DOGE / USD')
ax1.set_yscale('log')
ax1.legend()

df['difficulty'].plot(ax=ax2, label='difficulty')
df['difficulty'].rolling(7).mean().plot(ax=ax2, label='rolling mean difficulty')
ax2.set_ylabel('difficulty')
ax2.set_yscale('log')
ax2.legend();
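For readers new to rolling windows, the toy series below (made-up values) shows what rolling(7).mean() does: each output is the mean of the current value and the six before it, so the first six entries have no result.

```python
import pandas as pd

values = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])  # made-up values
rolling_mean = values.rolling(7).mean()

print(rolling_mean.tolist())
# The first six entries are NaN, followed by 4.0 (mean of 1..7) and 5.0 (mean of 2..8).
```

Since the sampled blocks are roughly a week apart, a window of 7 rows in the plots above spans about seven weeks, assuming no rows were dropped when joining with the price data.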

Both plots show low and rather flat values between 2015 and 2017, followed by a step-change in both. In the last few months we have another step-change in price, but have yet to see the same in difficulty. It would be interesting to recreate this plot in a couple of months to see if difficulty increases and whether miners have moved to doge.

Conclusion

With just a little bit of python we have pulled blockchain data and price data from two APIs. We have then converted this raw data into something usable (a pandas dataframe). Then, using some simple visualisations, we have maybe found out some new information about the dogecoin blockchain!

This article is available as a jupyter notebook on github.
