COVID-19 (Coronavirus Disease 2019) is a respiratory disease caused by a coronavirus named SARS-CoV-2. Coronaviruses are a group of RNA viruses that cause respiratory infections ranging from the common cold to severe lung infection.
The COVID-19 pandemic, also referred to as the coronavirus pandemic, is an ongoing pandemic of coronavirus disease 2019 (COVID-19). The outbreak was first identified in Wuhan, China, in December 2019. The World Health Organization declared the outbreak a public health emergency of international concern on 30 January 2020, and a pandemic on 11 March 2020.
The virus spreads primarily between people in close contact, most often via small droplets produced by coughing, sneezing, and talking. The droplets usually fall to the ground or onto surfaces rather than travelling through the air over long distances. Less commonly, people may become infected by touching a contaminated surface and then touching their face. The virus is most contagious during the first three days after the onset of symptoms, although spread is possible before symptoms appear and from people who never show symptoms.
Common symptoms include fever, cough, fatigue, shortness of breath, and loss of sense of smell. Complications may include pneumonia and acute respiratory distress syndrome. The time from exposure to onset of symptoms is usually around five days but may range from two to fourteen days. There is no known vaccine or specific antiviral treatment. Primary treatment is symptomatic and supportive therapy.
Recommended preventive measures include hand washing, covering one's mouth when coughing, maintaining distance from other people, wearing a mask in public settings, and monitoring and self-isolation for people who suspect they are infected. Authorities worldwide have responded by implementing travel restrictions, lockdowns, workplace hazard controls, and facility closures. Many places have also worked to increase testing capacity and trace contacts of infected persons.
from IPython.display import *
Image("https://media.foxbusiness.com/BrightCove/854081161001/202003/2652/854081161001_6141155653001_6141149610001-vs.jpg")
import warnings
warnings.filterwarnings("ignore")
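import json
# Placeholder credentials: substitute your own Kaggle username and API key (never publish a real key)
token = {"username":"<your-kaggle-username>","key":"<your-kaggle-api-key>"}
with open('kaggle.json', 'w') as file:
    json.dump(token, file)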
!mkdir -p ~/.kaggle
!cp kaggle.json ~/.kaggle/
!chmod 600 ~/.kaggle/kaggle.json
!kaggle datasets download -d sudalairajkumar/novel-corona-virus-2019-dataset
!kaggle datasets download -d winterpierre91/covid19-global-weather-data
from zipfile import ZipFile
ZipFile('novel-corona-virus-2019-dataset.zip').extractall()
ZipFile('covid19-global-weather-data.zip').extractall()
import os
for i in sorted(os.listdir()):
    print(i)
#libraries for data management
import numpy as np
import pandas as pd
#libraries for visualization
import matplotlib.pyplot as plt
import seaborn as sns
import plotly
import plotly.express as px
import plotly.graph_objs as go
import folium
from google.colab import files
from IPython.display import *
# plotly.io.renderers.default = 'colab'
plotly.offline.init_notebook_mode(connected=True)
temptr_data = pd.read_csv('temperature_dataframe.csv')
data = pd.read_csv('covid_19_data.csv',index_col = 'SNo')
case_confirmed = pd.read_csv('time_series_covid_19_confirmed.csv')
case_deaths = pd.read_csv('time_series_covid_19_deaths.csv')
case_recovered = pd.read_csv('time_series_covid_19_recovered.csv')
data = data.drop('Last Update',axis = 1)
data = data.fillna('unknown')
case_data = pd.DataFrame()
# Unique (country, state) pairs present in the data
country_state = data.groupby(['Country/Region','Province/State']).size().reset_index(name = 'a').drop(columns = 'a')
for country,state in country_state.values:
    temp = data[(data['Country/Region'] == country) & (data['Province/State'] == state)]
    # Start from the first observation (DataFrame.append was removed in pandas 2.0)
    temp1 = temp.iloc[0:1,:].reset_index(drop=True)
    temp1['new_confirmed'] = 0.0
    temp1['new_deaths'] = 0.0
    temp1['new_recovered'] = 0.0
    for i in range(1,len(temp)):
        date,sta,ctr,confirmed,deaths,recovered = temp.iloc[i].values
        new_confirmed = temp.iloc[i,3] - temp.iloc[i-1,3]
        new_deaths = temp.iloc[i,4] - temp.iloc[i-1,4]
        new_recovered = temp.iloc[i,5] - temp.iloc[i-1,5]
        # If a cumulative count drops, carry the previous value forward and record no new cases
        if new_confirmed < 0:
            confirmed = temp.iloc[i-1,3]
            new_confirmed = 0
        if new_deaths < 0:
            deaths = temp.iloc[i-1,4]
            new_deaths = 0
        if new_recovered < 0:
            recovered = temp.iloc[i-1,5]
            new_recovered = 0
        temp1.loc[i] = [date,sta,ctr,confirmed,deaths,recovered,new_confirmed,new_deaths,new_recovered]
    case_data = pd.concat([case_data,temp1],ignore_index=True)
#case_data
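The row-by-row loop above derives daily increments from cumulative counts. As a rough illustration of the same idea, the sketch below uses a grouped `diff` on a small made-up table; the `toy` DataFrame and its values are hypothetical, and only the increment step is covered (it does not carry the previous cumulative value forward the way the loop does).

```python
import pandas as pd

# Hypothetical toy data mirroring a few columns of covid_19_data.csv
toy = pd.DataFrame({
    "ObservationDate": ["01/22/2020", "01/23/2020", "01/24/2020"] * 2,
    "Province/State": ["Hubei"] * 3 + ["unknown"] * 3,
    "Country/Region": ["Mainland China"] * 3 + ["Italy"] * 3,
    "Confirmed": [10, 8, 15, 0, 2, 5],
})

keys = ["Country/Region", "Province/State"]
toy = toy.sort_values(keys + ["ObservationDate"]).reset_index(drop=True)

# Day-over-day change within each region:
# the first day has no predecessor (treated as 0), and negative
# changes (reporting corrections) are clipped to 0.
toy["new_confirmed"] = (
    toy.groupby(keys)["Confirmed"].diff().fillna(0).clip(lower=0).astype(int)
)
print(toy)
```

On real data this approach would likely be much faster than appending rows one at a time, though the two are not drop-in equivalents.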
case_data['Active'] = case_data['Confirmed'] - (case_data['Deaths'] + case_data['Recovered'])
case_data['ObservationDate'] = pd.to_datetime(case_data['ObservationDate'])
case_data['Confirmed'] = case_data['Confirmed'].astype('int')
case_data['Deaths'] = case_data['Deaths'].astype('int')
case_data['Recovered'] = case_data['Recovered'].astype('int')
case_data['Active'] = case_data['Active'].astype('int')
case_data['new_confirmed'] = case_data['new_confirmed'].astype('int')
case_data['new_deaths'] = case_data['new_deaths'].astype('int')
case_data['new_recovered'] = case_data['new_recovered'].astype('int')
case_data.loc[(case_data['Country/Region'] == ' Azerbaijan'),'Country/Region'] = 'Azerbaijan'
case_data.loc[(case_data['Country/Region'] == 'US'),'Country/Region'] = 'United States'
case_data.loc[(case_data['Country/Region'] == "('St. Martin',)"),'Country/Region'] = 'St Martin'
case_data.loc[(case_data['Country/Region'] == "UK"),'Country/Region'] = 'United Kingdom'
case_data.loc[(case_data['Country/Region'] == "Bahamas, The"),'Country/Region'] = 'Bahamas'
This section analyzes how the spread of the virus has evolved in different countries. In the data used here, most countries appear to have started showing a flattening curve.
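One possible way to make "flattening" concrete is to smooth the daily new-case counts and check whether the smoothed value has stopped growing. The sketch below is only an illustration under that assumption: the `daily_new` numbers are invented, and the 7-day window and 5% tolerance are arbitrary choices rather than anything defined in this notebook.

```python
import pandas as pd

# Hypothetical daily new-case counts for one country (not real data)
daily_new = pd.Series(
    [5, 9, 14, 22, 30, 41, 50, 52, 49, 51, 50, 48, 50, 49, 51, 50],
    index=pd.date_range("2020-03-01", periods=16, freq="D"),
    name="new_confirmed",
)

# A 7-day rolling mean smooths out day-to-day reporting noise
smoothed = daily_new.rolling(window=7).mean()

# Relative change of the smoothed value over the last 3 days
recent_change = smoothed.iloc[-1] / smoothed.iloc[-4] - 1

# Treat the curve as "flat" if the smoothed value moved by less than 5%
is_flat = abs(recent_change) < 0.05
print(f"7-day mean: {smoothed.iloc[-1]:.1f}, change: {recent_change:+.1%}, flat: {is_flat}")
```

The same check could be applied per country to the `new_confirmed` column once it is aggregated by date.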
country_case_data = case_data.groupby(['Country/Region','ObservationDate']).sum().reset_index()
country_case_data = country_case_data[country_case_data['ObservationDate'] == max(country_case_data['ObservationDate'].values)]
country_case_data = country_case_data.drop('ObservationDate',axis = 1)
country_case_data = country_case_data.set_index('Country/Region')
country_case_data = country_case_data[['Confirmed','new_confirmed','Deaths','new_deaths',
'Recovered','new_recovered','Active']]
country_case_data['Mortality Rate'] = (country_case_data['Deaths']/country_case_data['Confirmed'])*100
country_case_data = country_case_data.rename(columns={'Confirmed':'| Confirmed |','new_confirmed':'| new_confirmed |',
'Deaths':'| Deaths |','new_deaths':'| new_deaths |',
'Recovered':'| Recovered |','new_recovered':'| new_recovered |',
'Active':'| Active |','Mortality Rate':'| Mortality Rate |'})
country_case_data.sort_values('| Confirmed |', ascending= False).style\
.background_gradient(cmap='binary',subset=["| Confirmed |"])\
.background_gradient(cmap='Blues',subset=["| new_confirmed |"])\
.background_gradient(cmap='binary',subset=["| Deaths |"])\
.background_gradient(cmap='Reds',subset=["| new_deaths |"])\
.background_gradient(cmap='binary',subset=["| Recovered |"])\
.background_gradient(cmap='Greens',subset=["| new_recovered |"])\
.background_gradient(cmap='Purples',subset=["| Active |"])\
.background_gradient(cmap='YlOrBr',subset=["| Mortality Rate |"])
world_cumulative = pd.DataFrame(case_data.groupby(['ObservationDate']).sum()).reset_index()
fig = go.Figure()
fig.add_trace(go.Scatter(x=world_cumulative['ObservationDate'],y=world_cumulative['Confirmed'],mode='lines',name='Confirmed',line=dict( width=4)))
fig.add_trace(go.Scatter(x=world_cumulative['ObservationDate'],y=world_cumulative['Deaths'],mode='lines',name='Deaths',line=dict( width=4)))
fig.add_trace(go.Scatter(x=world_cumulative['ObservationDate'],y=world_cumulative['Recovered'],mode='lines',name='Recovered',line=dict( width=4)))
fig.add_trace(go.Scatter(x=world_cumulative['ObservationDate'],y=world_cumulative['Active'],mode='lines',name='Active',line=dict( width=4)))
fig.update_layout(margin=dict(l=0,r=20,b=0,t=60,pad=0),
paper_bgcolor="white",height= 600,
legend=dict(x=.01,y=.98),
title_text = 'Number of COVID-19 cases worldwide',font_size=15,
xaxis_title="Date",
yaxis_title="Number of cases",)
fig.layout.hovermode = 'x'
fig.show()
world_daily_cumulative = pd.DataFrame(case_data.groupby(['ObservationDate']).sum())
fig = go.Figure()
fig.add_trace(go.Scatter(x=world_daily_cumulative.index,y=world_daily_cumulative['new_confirmed'],mode='lines',name='Confirmed',line=dict( width=4)))
fig.add_trace(go.Scatter(x=world_daily_cumulative.index,y=world_daily_cumulative['new_deaths'],mode='lines',name='Deaths',line=dict( width=4)))
fig.add_trace(go.Scatter(x=world_daily_cumulative.index,y=world_daily_cumulative['new_recovered'],mode='lines',name='Recovered',line=dict( width=4)))
fig.update_layout(margin=dict(l=0,r=20,b=0,t=60,pad=0),
paper_bgcolor="white",height= 600,
legend=dict(x=.01,y=.98),
title_text = 'Number of daily COVID-19 cases worldwide',font_size=15,
xaxis_title="Date",
yaxis_title="Number of new cases",)
fig.layout.hovermode = 'x'
fig.show()
def CreateMap(data,color,fill_color):
    _map = folium.Map(location=[10,5], tiles="Stamen Toner", zoom_start=2.3)
    for name,cases,deaths,lat,lon in data.values:
        folium.CircleMarker([lat,lon],radius=((int(np.log(cases + 1)))*1.5),color=color,fill_color=fill_color,
                            tooltip = "<h5 style='text-align:center;font-weight: bold'>"+ name +"</h5>"+
                            "<hr style='margin:10px;'>"+
                            "<ul style='color: #444;list-style-type:circle;align-item:left;padding-left:20px;padding-right:20px'>"+
                            "<li>Confirmed: "+str(cases)+"</li>"+
                            "<li>Deaths: "+str(deaths)+"</li>"+
                            "</ul>",fill_opacity=0.7).add_to(_map)
    return _map
df_map_plot = pd.DataFrame()
for i in case_confirmed.index:
    if pd.isna(case_confirmed.loc[i,'Province/State']):
        df_map_plot.loc[i,'location'] = case_confirmed.loc[i,'Country/Region']
    else:
        df_map_plot.loc[i,'location'] = str(case_confirmed.loc[i,'Province/State']) + ", " + case_confirmed.loc[i,'Country/Region']
    if case_confirmed.loc[i,case_confirmed.columns[-1]] < 0 :
        df_map_plot.loc[i,'cases'] = 1
    else:
        df_map_plot.loc[i,'cases'] = case_confirmed.loc[i,case_confirmed.columns[-1]]
    df_map_plot.loc[i,'deaths'] = case_deaths.loc[i,case_deaths.columns[-1]]
    df_map_plot.loc[i,'Lat'] = str(case_confirmed.loc[i,'Lat'])
    df_map_plot.loc[i,'Long'] = str(case_confirmed.loc[i,'Long'])
df_map_plot['Lat'] = df_map_plot['Lat'].astype('float')
df_map_plot['Long'] = df_map_plot['Long'].astype('float')
df_map_plot['cases'] = df_map_plot['cases'].astype('int')
df_map_plot['deaths'] = df_map_plot['deaths'].astype('int')
CreateMap(df_map_plot,'#022474','#4D76D7')
temp_df = case_data.groupby(['Country/Region','ObservationDate']).sum().reset_index().sort_values(['ObservationDate'])
temp_df['ObservationDate'] = temp_df['ObservationDate'].astype('str')
fig = px.choropleth(temp_df, locations="Country/Region",
color=np.log10(temp_df["Confirmed"]),
hover_name="Country/Region",
hover_data=["Confirmed",'Deaths','Recovered'],
color_continuous_scale=px.colors.sequential.thermal_r,
locationmode="country names",
animation_frame='ObservationDate',color_continuous_midpoint = 3)
fig.update_layout(margin=dict(l=20,r=0,b=0,t=70,pad=0),
paper_bgcolor="white",
height= 700,
title_text = 'Cumulative confirmed COVID-19 cases by country (log10 color scale)',font_size=18)
fig.show()
case_confirmed_racemap = pd.DataFrame(case_confirmed.groupby('Country/Region').sum()).reset_index()
case_confirmed_racemap.loc[case_confirmed_racemap['Country/Region'] == 'US','Country/Region'] = 'United States'
case_confirmed_racemap.loc[case_confirmed_racemap['Country/Region'] == 'Korea, South','Country/Region'] = 'South Korea'
for col in case_confirmed_racemap.columns[3:]:
    case_confirmed_racemap.rename(columns = {col:str(pd.to_datetime(col))[0:10]},inplace = True)
flags = pd.read_csv('https://raw.githubusercontent.com/AbshkPskr/Portfolio/master/Country_Flags.csv')
flags = flags.drop(columns='Images File Name').rename(columns={'Country':'Country/Region'})
confirmed_with_flag = case_confirmed_racemap.merge(flags,on='Country/Region',how='left')
confirmed_with_flag.to_csv('country.csv')
files.download('country.csv')
%%HTML
<div class="flourish-embed flourish-bar-chart-race" data-src="visualisation/2059674" data-url="https://flo.uri.sh/visualisation/2059674/embed"><script src="https://public.flourish.studio/resources/embed.js"></script></div>
Countries with more than 80,000 confirmed cases
case_country = case_data.groupby(['Country/Region','ObservationDate']).sum().reset_index()
countries = case_country[case_country['Confirmed'] > 80000]['Country/Region'].unique()
def CreateComparisonPlot(attr,title,x_title,y_title, y_axis_type = None, exclude = 'United States'):
    fig = go.Figure()
    for i in countries:
        if i == exclude: continue
        if exclude is None:
            one_country = case_country[(case_country['Country/Region'] == i) & (case_country[attr] > 0)][['ObservationDate',attr]]
            if y_axis_type == 'log': one_country['ObservationDate'] = [i for i in range(1,len(one_country)+1)]
        else:
            one_country = case_country[(case_country['Country/Region'] == i)][['ObservationDate',attr]]
        fig.add_trace(go.Scatter(x=one_country['ObservationDate'],
                                 y=one_country[attr],mode='lines',
                                 name=i,line=dict( width=4)))
    # fig.add_annotation(text="First case of United states",
    #                    x='2020-01-23', y=1, arrowhead=0, showarrow=True)
    fig.update_layout(margin=dict(l=0,r=20,b=0,t=60,pad=0),
                      paper_bgcolor="white",height= 600,
                      legend=dict(x=.01,y=.98,font=dict(size =12)),
                      title_text = title,font_size=15,
                      xaxis_title=x_title,
                      yaxis_title=y_title)
    if y_axis_type == 'log' :
        fig.update_yaxes(type="log")
        fig.update_layout(legend=dict(x=.85,y=.02,font=dict(size =12)))
    fig.layout.hovermode = 'x'
    fig.show()
CreateComparisonPlot('Confirmed','Number of confirmed cases for most affected countries (Excluding USA)','Date','Number of cases')
CreateComparisonPlot('Deaths','Number of Deaths in most affected countries (Excluding USA)','Date','Number of cases')
CreateComparisonPlot('Recovered','Number of Recovered cases in most affected countries (Excluding USA)','Date','Number of cases')
CreateComparisonPlot('new_confirmed','Number of Daily Confirmed cases in most affected countries','Date','Number of cases',None,None)
CreateComparisonPlot('new_deaths','Number of Daily Death cases in most affected countries','Date','Number of cases',None,None)
CreateComparisonPlot('Confirmed','Number of confirmed cases for most affected countries',
'Day','Number of cases (log scale)','log',None)
CreateComparisonPlot('Deaths','Number of deaths for most affected countries',
                     'Day','Number of deaths (log scale)','log',None)
CreateComparisonPlot('Recovered','Number of recovered cases for most affected countries',
                     'Day','Number of recovered cases (log scale)','log',None)
case_country = case_data.groupby(['Country/Region','ObservationDate']).sum().reset_index()
case_country['ObservationDate'] = case_country['ObservationDate'].astype('str')
countries = case_country[case_country['Confirmed'] > 200].sort_values(['Confirmed'],ascending=False)['Country/Region'].unique()
# row = int((len(countries)+1)/3)
rows = len(countries)
columns = 2
f = plt.figure(figsize=(20,rows*5))
gs = f.add_gridspec(rows,columns)
sns.set(style = "whitegrid")
country = 0
for i in range(0,rows):
    data = case_country[case_country['Country/Region'] == countries[country]]
    data = data[data['Confirmed'] > 0]
    data['day'] = [i for i in range(1,len(data)+1)]
    for j in range(0,columns):
        #if country == len(countries): break
        ax = f.add_subplot(gs[i,j])
        if j == 0:
            sns.scatterplot(data = data,x = 'day',y = 'Confirmed',s = 50,color="#4348C4",edgecolor = 'none')
            sns.lineplot(data = data,x = 'day',y = 'Confirmed', color = '#4348C4')
            plt.fill_between(data['day'], data['Confirmed'], alpha=0.30, color = '#4348C4')
            sns.scatterplot(data = data,x = 'day',y = 'Recovered',s= 50,color="#5BAC4D",edgecolor = 'none')
            sns.lineplot(data = data,x = 'day',y = 'Recovered',color="#5BAC4D")
            plt.fill_between(data['day'], data['Recovered'],alpha=0.30,color="#5BAC4D")
            sns.scatterplot(data = data,x = 'day',y = 'Deaths',s= 50,color='#BB3535',edgecolor = 'none')
            sns.lineplot(data = data,x = 'day',y = 'Deaths',color='#BB3535')
            plt.fill_between(data['day'], data['Deaths'],alpha=0.30,color='#BB3535')
            ax.legend(['Confirmed','Recovered','Deaths'], fontsize=16,loc = 'upper left')
            ax.set_xlabel('Day',fontsize=20)
            ax.set_ylabel('No. of cases',fontsize=20)
            ax.set_title(countries[country], fontdict={'fontsize': 25, 'weight' : 'bold'}, color="black",loc = 'left')
            text = str(data['ObservationDate'].values[0]) + " - " + str(data['ObservationDate'].values[-1]) + "\n" + "\n"
            text += 'Total Cases :-' + "\n"
            text += 'Confirmed :' + str(data['Confirmed'].values[-1]) + "\n"
            text += 'Deaths :' + str(data['Deaths'].values[-1]) + "\n"
            text += 'Recovered :' + str(data['Recovered'].values[-1]) + "\n" + "\n"
            ax.text(0.02, 0.15, text, fontsize=15, transform=ax.transAxes,bbox=dict(facecolor='#F8F8F8', alpha=.7))
        if j == 1:
            sns.scatterplot(data = data,x = 'day',y = 'new_confirmed',s = 50,color="#4348C4",edgecolor = 'none')
            sns.lineplot(data = data,x = 'day',y = 'new_confirmed', color = '#4348C4')
            plt.fill_between(data['day'], data['new_confirmed'], alpha=0.30, color = '#4348C4')
            sns.scatterplot(data = data,x = 'day',y = 'new_recovered',s= 50,color="#5BAC4D",edgecolor = 'none')
            sns.lineplot(data = data,x = 'day',y = 'new_recovered',color="#5BAC4D")
            plt.fill_between(data['day'], data['new_recovered'],alpha=0.30,color="#5BAC4D")
            sns.scatterplot(data = data,x = 'day',y = 'new_deaths',s= 50,color='#BB3535',edgecolor = 'none')
            sns.lineplot(data = data,x = 'day',y = 'new_deaths',color='#BB3535')
            plt.fill_between(data['day'], data['new_deaths'],alpha=0.30,color='#BB3535')
            ax.legend(['Daily Confirmed','Daily Recovered','Daily Deaths'], fontsize=16,loc = 'upper left')
            ax.set_xlabel('Day',fontsize=20)
            ax.set_ylabel('No. of cases',fontsize=20)
            text = 'In last 24 hrs :-' + "\n"
            text += 'Confirmed :' + str(data['new_confirmed'].values[-1]) + "\n"
            text += 'Deaths :' + str(data['new_deaths'].values[-1]) + "\n"
            text += 'Recovered :' + str(data['new_recovered'].values[-1])
            ax.text(0.02, 0.3, text, fontsize=15, transform=ax.transAxes,bbox=dict(facecolor='#F8F8F8', alpha=.7))
    country += 1
f.tight_layout()