Cryptocurrencies continue to draw a lot of attention from investors, entrepreneurs, regulators and the general public. Much of the recent public discussion of cryptocurrencies has been triggered by substantial changes in their prices, so an exploratory analysis is an efficient and quick way to keep track of the cryptocurrency market.
Cryptocurrencies are digital financial assets for which records and transfers of ownership are guaranteed by cryptographic technology rather than by a bank or other trusted third party. They can be viewed as financial assets because they bear some value for their holders, even though they represent no matching liability of any other party and are not backed by any physical asset of value, such as gold.
The purpose of this project is to analyze a real-world scenario: the cryptocurrency market. It is aimed at crypto enthusiasts, crypto investors, day traders and anyone who wants to learn about cryptocurrencies and data visualization.
We will be analyzing and visualizing 12 major cryptocurrencies, which together represent roughly 90 to 95% of the cryptocurrency market: Bitcoin, Stellar, Dash, Ripple, Tether, Ethereum, Litecoin, Bitcoin Cash, Binance Coin, Tron, Chainlink and Cardano.
The data set contains historical OHLC prices (open, high, low and close) and the daily trading volume in USD.
import yfinance as yf
import pandas as pd
import plotly.offline as py
import plotly.graph_objs as go
import plotly.express as px
import plotly.io as pio
from datetime import datetime
start = '2015-10-01'
end = datetime.today().strftime('%Y-%m-%d')
We are going to use yfinance to gather our data. This module offers a reliable, threaded and Pythonic way to download historical market data from Yahoo! Finance, given the ticker symbols you are interested in. For this analysis we gather data from October 1st, 2015 (the start date set above) to the present day.
bitc = yf.download('BTC-USD', start=start, end=end)
stel = yf.download('XLM-USD', start=start, end=end)
dash = yf.download('DASH-USD', start=start, end=end)
ripp = yf.download('XRP-USD', start=start, end=end)
teth = yf.download('USDT-USD', start=start, end=end)
ethe = yf.download('ETH-USD', start=start, end=end)
lite = yf.download('LTC-USD', start=start, end=end)
bcash = yf.download('BCH-USD', start=start, end=end)
bcoin = yf.download('BNB-USD', start=start, end=end)
tron = yf.download('TRX-USD', start=start, end=end)
chai = yf.download('LINK-USD', start=start, end=end)
card = yf.download('ADA-USD', start=start, end=end)
print('All stocks are loaded')
[*********************100%***********************] 1 of 1 completed
All stocks are loaded
bitc['Currency'] = 'Bitcoin'
stel['Currency'] = 'Stellar'
dash['Currency'] = 'Dash'
ripp['Currency'] = 'Ripple'
teth['Currency'] = 'Tether'
ethe['Currency'] = 'Ethereum'
lite['Currency'] = 'Litecoin'
bcash['Currency'] = 'Bitcoin Cash'
bcoin['Currency'] = 'Binance Coin'
tron['Currency'] = 'Tron'
chai['Currency'] = 'Chainlink'
card['Currency'] = 'Cardano'
frames = [bitc, stel, dash, ripp, teth, ethe, lite, bcash, bcoin, tron, chai, card]
crypto = pd.concat(frames)
crypto.reset_index(inplace=True)
crypto.head(5)
| | Date | Open | High | Low | Close | Adj Close | Volume | Currency |
|---|---|---|---|---|---|---|---|---|
| 0 | 2015-10-01 | 236.003998 | 238.445007 | 235.615997 | 237.548996 | 237.548996 | 20488800 | Bitcoin |
| 1 | 2015-10-02 | 237.264008 | 238.541000 | 236.602997 | 237.292999 | 237.292999 | 19677900 | Bitcoin |
| 2 | 2015-10-03 | 237.201996 | 239.315002 | 236.944000 | 238.729996 | 238.729996 | 16482700 | Bitcoin |
| 3 | 2015-10-04 | 238.531006 | 238.968002 | 237.940002 | 238.259003 | 238.259003 | 12999000 | Bitcoin |
| 4 | 2015-10-05 | 238.147003 | 240.382996 | 237.035004 | 240.382996 | 240.382996 | 23335900 | Bitcoin |
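The download, labelling and concatenation cells above repeat the same three steps for every coin, so they could also be written as one loop. The following is only a sketch: `TICKERS`, `load_market` and the `download` argument are hypothetical names introduced here, and `download` exists purely so the function can be tried with a stand-in instead of a live Yahoo! Finance request.

```python
import pandas as pd

# Display name -> Yahoo! Finance ticker, taken from the cells above.
TICKERS = {
    "Bitcoin": "BTC-USD",
    "Stellar": "XLM-USD",
    "Dash": "DASH-USD",
    "Ripple": "XRP-USD",
    "Tether": "USDT-USD",
    "Ethereum": "ETH-USD",
    "Litecoin": "LTC-USD",
    "Bitcoin Cash": "BCH-USD",
    "Binance Coin": "BNB-USD",
    "Tron": "TRX-USD",
    "Chainlink": "LINK-USD",
    "Cardano": "ADA-USD",
}


def load_market(tickers, start, end, download=None):
    """Download each ticker, tag it with its display name, and stack the results.

    `download` is a hypothetical hook for testing; when it is omitted,
    yfinance's own `download` function is used.
    """
    if download is None:
        import yfinance as yf  # imported lazily so the sketch loads without it
        download = yf.download

    frames = []
    for name, ticker in tickers.items():
        frame = download(ticker, start=start, end=end).copy()
        frame["Currency"] = name
        frames.append(frame)
    # reset_index turns the Date index into a regular column, as in the notebook.
    return pd.concat(frames).reset_index()
```

Calling `load_market(TICKERS, start, end)` should give a frame with the same layout as `crypto`, and adding a thirteenth coin becomes a one-line change to the dictionary.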
In this dataframe we have more than 5 years of historical data. The price columns are the open, high, low, close and adjusted close prices. There is also a Volume column, which gives the daily transaction volume in USD.
The price values are calculated on a 24-hour basis. We're going to use the close price, which many financial analysts consider the most accurate representation of price among open, high, low and close. Some consider the adjusted close price the most accurate representation, which is particularly true for stocks, as it accounts for dividends and stock splits. However, cryptocurrencies pay no dividends, so for visualization purposes we're going to use the close price.
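One way to confirm that nothing is lost by choosing the close price is to compare the two columns directly. The sketch below assumes a frame with the `Close` and `Adj Close` columns shown in the table above; `adj_close_matches_close` is a hypothetical helper name, not part of pandas or yfinance.

```python
import pandas as pd


def adj_close_matches_close(frame, tolerance=1e-9):
    """Return True when 'Adj Close' and 'Close' agree on every row.

    If the frame has no 'Adj Close' column there is nothing to compare,
    so the function also returns True.
    """
    if "Adj Close" not in frame.columns:
        return True
    gap = (frame["Adj Close"] - frame["Close"]).abs()
    return bool((gap <= tolerance).all())


# Two rows copied from the Bitcoin table above.
sample = pd.DataFrame(
    {"Close": [237.548996, 237.292999], "Adj Close": [237.548996, 237.292999]}
)
print(adj_close_matches_close(sample))
```

Running the same check on the full `crypto` frame would show whether the two columns ever diverge in the downloaded data.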
crypto.groupby(['Currency']).count()
| Currency | Date | Open | High | Low | Close | Adj Close | Volume |
|---|---|---|---|---|---|---|---|
| Binance Coin | 1398 | 1398 | 1398 | 1398 | 1398 | 1398 | 1398 |
| Bitcoin | 2061 | 2061 | 2061 | 2061 | 2061 | 2061 | 2061 |
| Bitcoin Cash | 1400 | 1400 | 1400 | 1400 | 1400 | 1400 | 1400 |
| Cardano | 1330 | 1330 | 1330 | 1330 | 1330 | 1330 | 1330 |
| Chainlink | 1341 | 1341 | 1341 | 1341 | 1341 | 1341 | 1341 |
| Dash | 2061 | 2061 | 2061 | 2061 | 2061 | 2061 | 2061 |
| Ethereum | 2061 | 2061 | 2061 | 2061 | 2061 | 2061 | 2061 |
| Litecoin | 2061 | 2061 | 2061 | 2061 | 2061 | 2061 | 2061 |
| Ripple | 2061 | 2061 | 2061 | 2061 | 2061 | 2061 | 2061 |
| Stellar | 2061 | 2061 | 2061 | 2061 | 2061 | 2061 | 2061 |
| Tether | 2061 | 2061 | 2061 | 2061 | 2061 | 2061 | 2061 |
| Tron | 1348 | 1348 | 1348 | 1348 | 1348 | 1348 | 1348 |
The summary above shows that seven of the twelve currencies have 2061 rows, while the others have fewer. The shorter histories belong to younger cryptocurrencies that have not been on the market for as long as the rest.
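The row counts only hint at which coins are younger; the first date on record makes it explicit. This is a minimal sketch assuming a frame with the `Date` and `Currency` columns of `crypto`; `history_span` and its output column names are hypothetical.

```python
import pandas as pd


def history_span(crypto):
    """Summarise first date, last date and row count for each currency."""
    summary = (
        crypto.assign(Date=pd.to_datetime(crypto["Date"]))
        .groupby("Currency")["Date"]
        .agg(["min", "max", "count"])
        .rename(columns={"min": "first_date", "max": "last_date", "count": "rows"})
    )
    # Oldest histories first, so the younger coins end up at the bottom.
    return summary.sort_values("first_date")
```

Applied to `crypto`, the currencies with fewer rows should appear at the bottom with later first dates.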
Now let's use box plots to compare the currencies in terms of price. Remember that we are using the close price, as it is the best representation of price in this situation. Let's also set the log_y parameter so that all currencies are visible, since the highest-priced currencies (like Bitcoin and Ethereum) would otherwise compress the rest of the data.
px.box(crypto, x='Currency', y='Close', title='Cryptocurrency Market Price Comparison', log_y=True)
The top three cryptocurrencies by price are Bitcoin, Bitcoin Cash and Ethereum. The smaller coins would not be visible without the logarithmic axis, since some of them have very low prices and even trade for pennies.
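To see why a linear axis fails here, it helps to measure how far apart the typical prices are. The sketch below is an illustration only: `price_spread` is a hypothetical helper, and it assumes a frame with the `Currency` and `Close` columns of `crypto`, with strictly positive prices.

```python
import numpy as np
import pandas as pd


def price_spread(crypto):
    """Median close per currency and how many orders of magnitude they span."""
    medians = crypto.groupby("Currency")["Close"].median().sort_values(ascending=False)
    # log10 of the ratio between the highest and lowest median price
    orders = float(np.log10(medians.max() / medians.min()))
    return medians, orders
```

A span of several orders of magnitude means the cheapest coins would occupy a tiny fraction of a linear axis, which is the situation `log_y=True` addresses.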
Next, let's compare the currencies with each other in terms of transaction volume, using violin charts. We use a violin plot because it provides statistical details such as the median, maximum, minimum, first quartile and third quartile, which are very useful for financial analysis.
If you are a crypto enthusiast or an investor, you should know that the cryptocurrency with the highest average trading volume is not Bitcoin.
px.violin(crypto, x='Currency', y='Volume', title='Cryptocurrency Market Volume Comparison <br> Violin Chart')
As you can see, Bitcoin and Tether dominate in terms of volume, and the graph clearly shows that the transaction volume of Tether is higher than that of Bitcoin.
This is because Tether is wildly popular in China and East Asia. Tether is pegged at $1, which makes it one of the easiest ways to transfer money from one country to another, especially in countries where taking money out of the country is difficult. If you hover over a currency's violin, the tooltip shows the statistical values we need for financial analysis, such as the median, mean, quartiles, minimum and maximum.
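The same statistics can be read from a table instead of hovering over each violin. This is a minimal sketch assuming the `Currency` and `Volume` columns of `crypto`; `volume_stats` is a hypothetical helper built on pandas' `describe`.

```python
import pandas as pd


def volume_stats(crypto):
    """Per-currency count, mean, std, min, quartiles and max of daily volume."""
    stats = crypto.groupby("Currency")["Volume"].describe()
    # Highest average daily volume first.
    return stats.sort_values("mean", ascending=False)
```

The first row of the result is the currency with the highest average daily volume, which is the comparison the violin chart is making.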
The pie chart below shows each currency's share of the total traded volume.
px.pie(crypto, values='Volume', names='Currency', title='Cryptocurrency Market Volume Comparison <br> Pie Chart')
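The shares the pie chart displays can also be computed directly, which is handy when you want the numbers rather than the picture. The sketch assumes that the chart sums `Volume` over all rows of each currency; `volume_share` is a hypothetical helper name.

```python
import pandas as pd


def volume_share(crypto):
    """Percentage of total traded volume attributable to each currency."""
    totals = crypto.groupby("Currency")["Volume"].sum()
    share = 100 * totals / totals.sum()
    return share.sort_values(ascending=False).round(1)
```

Because every row counts equally, coins with a longer history contribute more days to their total than the younger ones.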
Tether has the highest volume with 41%, followed by Bitcoin with 30% and Ethereum with almost 14%. Tether and Bitcoin therefore account for roughly 70% of the cryptocurrency market in terms of volume. Let's go a step further and visualize the entire cryptocurrency market in one graph, plotting volume against price in a scatter plot.
px.scatter(crypto, x='Close', y='Volume', color='Currency', hover_data=['High', 'Low', 'Date'], log_x=True, log_y=True, height=600, title='Cryptocurrency Market Volume Comparison <br> Scatter Plot')
As with the price chart, logarithmic axes (the log_x and log_y parameters) were necessary, since Bitcoin is so dominant that it would otherwise make the other coins invisible by concentrating them in a small area.
As you can see, Bitcoin, which is represented by the blue color at the top right, dominates the entire graph. This means it has one of the highest prices and the highest volumes. Apart from Bitcoin, Tether also stands out with its low price but one of the highest volumes, at nearly $100 billion.
Other coins that perform well are Bitcoin Cash, Ethereum, Dash, Litecoin and Chainlink, in that order. More volume means more liquidity.
# Select the Bitcoin rows; the chart uses the close price, as discussed above
df = crypto.loc[crypto['Currency']=='Bitcoin']
px.line(df, x='Date', y='Close', title='Bitcoin Historic Price Behavior (2016 - 2021)')
In the line chart you can see that the price of Bitcoin grew slowly until early 2017, 8 years after its launch, and then exploded. If you hover over early 2017, you can see that in January 2017 the price of Bitcoin was less than 1,000 USD, whereas in December 2017 it was around 19,000 USD. One year later, in December 2018, the price was around 3,800 USD. Then, starting in the fourth quarter of 2020, came a rally that almost no one expected, the largest in the history of cryptocurrencies up to that point, with a peak of about 63,500 USD.
As you can see in the chart below, the transaction volume of Bitcoin has an increasing trend, with three major peaks. The first came in 2018, when the price of Bitcoin had been at about $20,000 and the Bitcoin bubble had just burst, leading millions of people to sell their Bitcoin. The second came in early 2020, when the stock market crashed because of the coronavirus lockdowns and investors ran towards cryptocurrency and gold as a hedge. The third, in the first quarter of 2021, is the highest point. Volume has its ups and downs, but the trend is still rising.
px.area(df, x='Date', y='Volume', title='Bitcoin Historic Volume Behavior (2016 - 2021)')
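The peaks described above can also be located numerically rather than by eye. The following sketch assumes a single-currency frame such as `df`, with `Date`, `Close` and `Volume` columns and a unique index; `yearly_volume_peaks` is a hypothetical helper that picks the highest-volume day of each calendar year.

```python
import pandas as pd


def yearly_volume_peaks(df):
    """Return the highest-volume day of each calendar year."""
    data = df.assign(Date=pd.to_datetime(df["Date"]))
    # Index label of the maximum-volume row within each year
    peak_rows = data.groupby(data["Date"].dt.year)["Volume"].idxmax()
    peaks = data.loc[peak_rows, ["Date", "Close", "Volume"]]
    # Use the year as the index so each peak is easy to look up
    return peaks.set_index(peaks["Date"].dt.year.rename("Year"))
```

One row per year is a coarse view; it will not separate two peaks that fall in the same year.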
Since our data set has OHLC columns (open, high, low and close prices), we can visualize it as a candlestick chart.
Each small candle represents how Bitcoin performed on that day. The green candles represent days on which Bitcoin closed higher than it opened, while the red ones are days on which Bitcoin closed lower than it opened. This format is very useful for understanding the behavior of the price.
fig= go.Figure(data=[go.Candlestick(x=df['Date'],
open=df['Open'],
high=df['High'],
low=df['Low'],
close=df['Close'])])
fig.update_layout(title='Bitcoin Price Candlestick (2016 - 2021)')
fig.show()
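The green and red distinction described above comes down to comparing each day's close with its open, which can be sketched without any plotting. `candle_direction` is a hypothetical helper, and the `flat` label for days where close equals open is an assumption made here for completeness.

```python
import pandas as pd


def candle_direction(df):
    """Label each day 'up' (close above open), 'down' (close below) or 'flat'."""
    direction = pd.Series("flat", index=df.index, name="Direction")
    direction[df["Close"] > df["Open"]] = "up"
    direction[df["Close"] < df["Open"]] = "down"
    return direction
```

Calling `candle_direction(df).value_counts()` gives a quick tally of up and down days for the selected currency.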
Using Python we can get quite useful insights from data analysis and visualization that might not be clear if we looked at the data on finance-related websites, since these usually show charts one by one for each currency.
The cryptocurrency market has always been volatile, so an analysis of the historical data can help you make more accurate decisions if you are an investor.
You are very welcome to follow the same analysis and visualization process for the other 11 coins if you are curious. Maybe you can get new insights and make some money.
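When repeating the charts for another coin it is easy to leave the old coin's name in a title. One way to avoid that is to derive the titles from the currency name, as in this sketch. `currency_view` and the keys of its `titles` dictionary are hypothetical, and the year range is taken from the data rather than typed in by hand.

```python
import pandas as pd


def currency_view(crypto, name):
    """Return one currency's rows together with chart titles that name it."""
    df = crypto.loc[crypto["Currency"] == name].copy()
    if df.empty:
        raise ValueError(f"no rows for currency {name!r}")
    years = pd.to_datetime(df["Date"]).dt.year
    span = f"({years.min()} - {years.max()})"
    titles = {
        "price": f"{name} Historic Price Behavior {span}",
        "volume": f"{name} Historic Volume Behavior {span}",
        "candlestick": f"{name} Price Candlestick {span}",
    }
    return df, titles


# Intended use with the notebook's `crypto` frame and plotly imports:
# df, titles = currency_view(crypto, "Ethereum")
# px.line(df, x="Date", y="Close", title=titles["price"])
# px.area(df, x="Date", y="Volume", title=titles["volume"])
```

The commented lines show the intended use; they rely on the `crypto` frame and the `px` import from earlier in the notebook.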
df = crypto.loc[crypto['Currency']=='Ethereum']
px.line(df, x='Date', y='Close', title='Ethereum Historic Price Behavior (2016 - 2021)')
df
| | Date | Open | High | Low | Close | Adj Close | Volume | Currency |
|---|---|---|---|---|---|---|---|---|
| 10305 | 2015-10-01 | 0.734307 | 0.734307 | 0.655906 | 0.690215 | 0.690215 | 596084 | Ethereum |
| 10306 | 2015-10-02 | 0.683732 | 0.691120 | 0.654605 | 0.678574 | 0.678574 | 219318 | Ethereum |
| 10307 | 2015-10-03 | 0.678783 | 0.709204 | 0.675482 | 0.687171 | 0.687171 | 163326 | Ethereum |
| 10308 | 2015-10-04 | 0.686343 | 0.693126 | 0.660716 | 0.668379 | 0.668379 | 103497 | Ethereum |
| 10309 | 2015-10-05 | 0.666784 | 0.674438 | 0.624450 | 0.628643 | 0.628643 | 234263 | Ethereum |
| ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 12361 | 2021-05-22 | 2436.014648 | 2483.983154 | 2168.124268 | 2295.705566 | 2295.705566 | 42089937660 | Ethereum |
| 12362 | 2021-05-23 | 2298.367188 | 2384.411621 | 1737.468750 | 2109.579834 | 2109.579834 | 56005721977 | Ethereum |
| 12363 | 2021-05-24 | 2099.936035 | 2672.595703 | 2090.639648 | 2643.591064 | 2643.591064 | 53697121740 | Ethereum |
| 12364 | 2021-05-25 | 2649.033203 | 2750.534912 | 2394.355469 | 2706.628906 | 2706.628906 | 49558333256 | Ethereum |
| 12365 | 2021-05-26 | 2720.290771 | 2911.735596 | 2658.494629 | 2829.011963 | 2829.011963 | 42585575424 | Ethereum |
2061 rows × 8 columns
px.area(df, x='Date', y='Volume', title='Ethereum Historic Volume Behavior (2016 - 2021)')
fig= go.Figure(data=[go.Candlestick(x=df['Date'],
open=df['Open'],
high=df['High'],
low=df['Low'],
close=df['Close'])])
fig.update_layout(title='Ethereum Price Candlestick (2016 - 2021)')
fig.show()