Analysis of COVID-19 pandemic

COVID-19 (Coronavirus Disease 2019) is a respiratory disease caused by a coronavirus named SARS-CoV-2. Coronaviruses are a group of RNA viruses that cause respiratory infections ranging from the common cold to severe lung infection.

The COVID-19 pandemic, also referred to as the coronavirus pandemic, is an ongoing pandemic of coronavirus disease 2019 (COVID-19). The outbreak was first identified in Wuhan, China, in December 2019. The World Health Organization declared the outbreak a public health emergency of international concern on 30 January 2020, and a pandemic on 11 March 2020.

The virus spreads primarily between people in close contact, most often via small droplets produced by coughing, sneezing, and talking. The droplets usually fall to the ground or onto surfaces rather than travelling through the air over long distances. Less commonly, people may become infected by touching a contaminated surface and then touching their face. The virus is most contagious during the first three days after the onset of symptoms, although spread is possible before symptoms appear and from people who show no symptoms.

Common symptoms include fever, cough, fatigue, shortness of breath, and loss of sense of smell. Complications may include pneumonia and acute respiratory distress syndrome. The time from exposure to onset of symptoms is usually around five days but may range from two to fourteen days. At the time of writing, there is no known vaccine or specific antiviral treatment. Primary treatment is symptomatic and supportive therapy.

Recommended preventive measures include hand washing, covering one's mouth when coughing, maintaining distance from other people, wearing a mask in public settings, and monitoring and self-isolation for people who suspect they are infected. Authorities worldwide have responded by implementing travel restrictions, lockdowns, workplace hazard controls, and facility closures. Many places have also worked to increase testing capacity and trace contacts of infected persons.

from IPython.display import * 
Image("https://media.foxbusiness.com/BrightCove/854081161001/202003/2652/854081161001_6141155653001_6141149610001-vs.jpg")

Data collection and preprocessing

import warnings
warnings.filterwarnings("ignore")
import json
token = {"username":"abshkpskr","key":"7bf37d08aafdc81ecb2640a1960fddbc"}
with open('kaggle.json', 'w') as file:
    json.dump(token, file)

!mkdir -p ~/.kaggle
!cp kaggle.json ~/.kaggle/
!chmod 600 ~/.kaggle/kaggle.json
!kaggle datasets download -d sudalairajkumar/novel-corona-virus-2019-dataset
!kaggle datasets download -d winterpierre91/covid19-global-weather-data
from zipfile import ZipFile
ZipFile('novel-corona-virus-2019-dataset.zip').extractall()
ZipFile('covid19-global-weather-data.zip').extractall()

import os
for i in sorted(os.listdir()):
    print(i)
#libraries for data management
import numpy as np
import pandas as pd

#libraries for visualization
import matplotlib.pyplot as plt
import seaborn as sns
import plotly
import plotly.express as px
import plotly.graph_objs as go
import folium 

from google.colab import files
from IPython.display import *

# plotly.io.renderers.default = 'colab'
plotly.offline.init_notebook_mode(connected=True)
temptr_data = pd.read_csv('temperature_dataframe.csv')
data = pd.read_csv('covid_19_data.csv',index_col = 'SNo')
case_confirmed = pd.read_csv('time_series_covid_19_confirmed.csv')
case_deaths = pd.read_csv('time_series_covid_19_deaths.csv')
case_recovered = pd.read_csv('time_series_covid_19_recovered.csv')
data = data.drop('Last Update',axis = 1)
data = data.fillna('unknown')
case_data = pd.DataFrame()

country_state = data.groupby(['Country/Region','Province/State']).size().reset_index(name = 'a').drop('a',axis = 1)

for country,state in country_state.values:
    temp = data[(data['Country/Region'] == country) & (data['Province/State'] == state)]
    # start from the first observation of this region; its daily increments are zero
    temp1 = temp.iloc[0:1,:].copy().reset_index(drop=True)
    temp1['new_confirmed'] = 0.0
    temp1['new_deaths'] = 0.0
    temp1['new_recovered'] = 0.0
    for i in range(1,len(temp)):
        date,sta,ctr,confirmed,deaths,recovered = temp.iloc[i].values
        new_confirmed = temp.iloc[i,3] - temp.iloc[i-1,3]
        new_deaths = temp.iloc[i,4] - temp.iloc[i-1,4]
        new_recovered = temp.iloc[i,5] - temp.iloc[i-1,5]
        # a negative increment means the cumulative count was revised downward:
        # carry the previous day's total forward and record no new cases
        if new_confirmed < 0:
            confirmed = temp.iloc[i-1,3]
            new_confirmed = 0
        if new_deaths < 0:
            deaths = temp.iloc[i-1,4]
            new_deaths = 0
        if new_recovered < 0:
            recovered = temp.iloc[i-1,5]
            new_recovered = 0
        temp1.loc[i] = [date,sta,ctr,confirmed,deaths,recovered,new_confirmed,new_deaths,new_recovered]
    case_data = pd.concat([case_data,temp1],ignore_index=True)

#case_data
case_data['Active'] = case_data['Confirmed'] - (case_data['Deaths'] + case_data['Recovered'])
case_data['ObservationDate'] = pd.to_datetime(case_data['ObservationDate'])
case_data['Confirmed']  = case_data['Confirmed'].astype('int')
case_data['Deaths']  = case_data['Deaths'].astype('int')
case_data['Recovered']  = case_data['Recovered'].astype('int')
case_data['Active']  = case_data['Active'].astype('int')
case_data['new_confirmed']  = case_data['new_confirmed'].astype('int')
case_data['new_deaths']  = case_data['new_deaths'].astype('int')
case_data['new_recovered']  = case_data['new_recovered'].astype('int')
case_data.loc[(case_data['Country/Region'] == ' Azerbaijan'),'Country/Region'] = 'Azerbaijan'
case_data.loc[(case_data['Country/Region'] == 'US'),'Country/Region'] = 'United States'
case_data.loc[(case_data['Country/Region'] == "('St. Martin',)"),'Country/Region'] = 'St Martin'
case_data.loc[(case_data['Country/Region'] == "UK"),'Country/Region'] = 'United Kingdom'
case_data.loc[(case_data['Country/Region'] == "Bahamas, The"),'Country/Region'] = 'Bahamas'
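
The preprocessing above derives daily increments from cumulative counts one region at a time. As a rough, vectorized sketch of the same idea, the snippet below uses a grouped `diff()` and clamps negative increments to zero. The toy frame, the helper name `add_daily_increments`, and its arguments are illustrative assumptions, not part of the original notebook. Unlike the loop, this sketch does not overwrite a downward-revised cumulative value with the previous day's total.

```python
import pandas as pd

# Toy cumulative counts for two hypothetical regions (not real data).
toy = pd.DataFrame({
    "Country/Region": ["A", "A", "A", "B", "B", "B"],
    "Province/State": ["unknown"] * 6,
    "ObservationDate": pd.to_datetime(
        ["2020-03-01", "2020-03-02", "2020-03-03"] * 2),
    "Confirmed": [1, 4, 3, 0, 2, 5],   # region A has a downward correction
})

def add_daily_increments(df, value_col, new_col):
    """Return a copy of df with day-over-day increments of value_col per region."""
    keys = ["Country/Region", "Province/State"]
    out = df.sort_values(keys + ["ObservationDate"]).copy()
    # diff() within each region; the first row of a region has no predecessor.
    increments = out.groupby(keys)[value_col].diff().fillna(0)
    # Negative increments come from downward revisions; clamp them to zero.
    out[new_col] = increments.clip(lower=0).astype(int)
    return out

toy_daily = add_daily_increments(toy, "Confirmed", "new_confirmed")
print(toy_daily[["Country/Region", "ObservationDate", "Confirmed", "new_confirmed"]])
```

On the full dataset the same helper could be called once per column (`Confirmed`, `Deaths`, `Recovered`), which would likely be much faster than appending rows in a Python loop.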

Data analysis

This section examines how the virus has spread in different countries. In the data used here, most countries have started to show a flattening curve.

General Analysis

country_case_data = case_data.groupby(['Country/Region','ObservationDate']).sum(numeric_only=True).reset_index()
country_case_data = country_case_data[country_case_data['ObservationDate'] == country_case_data['ObservationDate'].max()]
country_case_data = country_case_data.drop('ObservationDate',axis = 1)
country_case_data = country_case_data.set_index('Country/Region')
country_case_data = country_case_data[['Confirmed','new_confirmed','Deaths','new_deaths',
                                       'Recovered','new_recovered','Active']]
country_case_data['Mortality Rate'] =  (country_case_data['Deaths']/country_case_data['Confirmed'])*100                          
country_case_data = country_case_data.rename(columns={'Confirmed':'|  Confirmed  |','new_confirmed':'|  new_confirmed  |',
                                                      'Deaths':'|  Deaths  |','new_deaths':'|  new_deaths  |',
                                                      'Recovered':'|  Recovered  |','new_recovered':'|  new_recovered  |',
                                                      'Active':'|  Active  |','Mortality Rate':'|  Mortality Rate  |'})                          

country_case_data.sort_values('|  Confirmed  |', ascending= False).style\
.background_gradient(cmap='binary',subset=["|  Confirmed  |"])\
.background_gradient(cmap='Blues',subset=["|  new_confirmed  |"])\
.background_gradient(cmap='binary',subset=["|  Deaths  |"])\
.background_gradient(cmap='Reds',subset=["|  new_deaths  |"])\
.background_gradient(cmap='binary',subset=["|  Recovered  |"])\
.background_gradient(cmap='Greens',subset=["|  new_recovered  |"])\
.background_gradient(cmap='Purples',subset=["|  Active  |"])\
.background_gradient(cmap='YlOrBr',subset=["|  Mortality Rate  |"])
Country/Region Confirmed new_confirmed Deaths new_deaths Recovered new_recovered Active Mortality Rate
United States 3054699 58601 132300 820 953420 16944 1968979 4.331032
Brazil 1713160 44571 67964 1223 1139844 32832 505352 3.967172
India 767929 25512 21129 487 476378 19547 270422 2.751426
Russia 699749 6534 10650 172 471718 8615 217381 1.521974
Peru 312911 3633 11133 181 204748 3810 97030 3.557881
Chile 303083 2064 6573 139 271741 3490 24769 2.168713
United Kingdom 288511 637 44602 126 1378 3 242531 15.459376
Mexico 275003 6995 32796 782 214316 4879 27891 11.925688
Spain 252513 383 28396 4 150376 0 73741 11.245362
Iran 248379 2691 12084 153 209463 2463 26832 4.865146
Italy 242149 193 34914 15 193646 831 13589 14.418395
Pakistan 240848 3359 4983 61 145311 4346 90554 2.068940
South Africa 224665 8810 3602 100 106842 4543 114221 1.603276
Saudi Arabia 220144 3036 2059 42 158050 3211 60035 0.935297
Turkey 208938 1041 5282 22 187511 2219 16145 2.528023
France 206365 293 29937 1 78010 230 98418 14.506820
Germany 198757 414 9046 14 183153 492 6558 4.551286
Bangladesh 172134 3489 2197 46 80838 2736 89099 1.276331
Colombia 124495 4214 4606 154 51861 1491 68028 3.699747
Canada 108334 311 8786 21 71805 387 27743 8.110104
Qatar 101553 608 138 4 96107 1204 5308 0.135890
Argentina 87030 3604 1694 50 36502 6407 48834 1.946455
Mainland China 83581 9 4634 0 78590 42 357 5.544322
Egypt 78304 1025 3564 75 22241 523 52499 4.551492
Sweden 73859 515 5482 35 0 0 68377 7.422251
Indonesia 68079 1853 3359 50 31585 800 33135 4.933974
Iraq 67442 2741 2779 94 37879 1627 26784 4.120578
Belarus 64224 221 443 7 52854 952 10927 0.689773
Ecuador 63245 0 4873 0 29071 0 29301 7.704957
Belgium 62123 65 9776 2 17138 16 35209 15.736523
United Arab Emirates 53045 445 327 1 42282 568 10436 0.616458
Kuwait 52007 762 379 2 42108 593 9520 0.728748
Ukraine 51457 835 1323 24 23912 942 26222 2.571079
Kazakhstan 51059 0 264 0 16298 0 34497 0.517049
Netherlands 50959 52 6154 3 187 0 44618 12.076375
Philippines 50359 2486 1314 5 12588 202 36457 2.609265
Oman 50207 1210 233 9 32005 1005 17969 0.464079
Singapore 45298 158 26 0 41323 321 3949 0.057398
Portugal 44859 443 1631 2 29714 269 13514 3.635837
Bolivia 42984 1439 1577 47 12883 485 28524 3.668807
Panama 41251 960 819 20 19469 743 20963 1.985406
Dominican Republic 39588 1158 829 8 20056 492 18703 2.094069
Poland 36689 277 1542 14 24878 640 10269 4.202895
Afghanistan 33594 210 936 16 20700 521 11958 2.786212
Israel 33557 1335 344 2 18338 111 14875 1.025121
Switzerland 32498 129 1966 0 29400 100 1132 6.049603
Bahrain 30931 610 98 0 26073 503 4760 0.316834
Nigeria 30249 460 684 15 12373 265 17192 2.261232
Romania 30175 555 1817 18 20799 265 7559 6.021541
Armenia 29820 535 521 18 17427 520 11872 1.747150
Honduras 25978 550 694 17 2721 84 22563 2.671491
Ireland 25542 4 1742 0 23364 0 436 6.820139
Guatemala 25411 624 1053 49 3718 143 20640 4.143875
Ghana 22822 854 129 0 17564 408 5129 0.565244
Azerbaijan 21916 542 274 9 13100 465 8542 1.250228
Japan 20261 206 982 1 17057 0 2222 4.846750
Austria 18513 92 706 0 16721 35 1086 3.813536
Moldova 18471 330 614 11 11549 308 6308 3.324130
Algeria 17348 469 978 10 12329 235 4041 5.637537
Serbia 17076 357 341 11 13562 115 3173 1.996955
Nepal 16423 255 35 0 7752 253 8636 0.213116
Cameroon 14916 0 359 0 11525 0 3032 2.406811
Morocco 14771 164 242 2 11316 677 3213 1.638345
South Korea 13293 49 287 2 12019 49 987 2.159031
Denmark 13101 12 609 0 12202 18 290 4.648500
Czech Republic 12814 129 351 0 8010 100 4453 2.739192
Ivory Coast 11504 310 78 2 5571 84 5855 0.678025
Uzbekistan 11092 254 45 4 7060 249 3987 0.405698
Sudan 10084 87 636 14 5074 40 4374 6.307021
Australia 9056 170 106 0 7573 86 1377 1.170495
Norway 8950 3 251 0 8138 0 561 2.804469
Kyrgyzstan 8847 568 116 9 3053 86 5678 1.311179
Malaysia 8677 3 121 0 8486 5 70 1.394491
El Salvador 8566 259 235 6 5133 204 3198 2.743404
Kenya 8528 278 169 2 2593 89 5766 1.981707
Venezuela 8008 315 75 4 2100 0 5833 0.936563
Senegal 7657 110 141 4 5097 74 2419 1.841452
Congo (Kinshasa) 7432 0 182 0 3226 0 4024 2.448870
North Macedonia 7406 162 359 8 3554 230 3493 4.847421
Finland 7265 3 329 0 6800 100 136 4.528562
Ethiopia 6774 928 120 17 2430 0 4224 1.771479
Haiti 6486 54 123 6 2181 101 4182 1.896392
Tajikistan 6364 49 54 1 5011 46 1299 0.848523
Bulgaria 6342 240 259 5 3166 129 2917 4.083885
Gabon 5871 128 46 0 2682 108 3143 0.783512
Bosnia and Herzegovina 5869 248 209 2 2769 76 2891 3.561084
Costa Rica 5836 350 25 2 1929 119 3882 0.428376
Guinea 5697 61 34 0 4577 35 1086 0.596805
Mauritania 5087 63 139 4 1994 50 2954 2.732455
West Bank and Gaza 5029 382 20 2 494 0 4515 0.397693
Djibouti 4889 11 55 0 4644 23 190 1.124974
Luxembourg 4650 47 110 0 4056 0 484 2.365591
Hungary 4210 5 589 0 2885 11 736 13.990499
Central African Republic 4109 38 52 0 1050 74 3007 1.265515
Kosovo 3886 183 82 3 2003 57 1801 2.110139
Greece 3622 33 193 0 1374 0 2055 5.328548
Madagascar 3573 101 33 0 1761 574 1779 0.923594
Croatia 3325 53 114 1 2277 48 934 3.428571
Thailand 3197 0 58 0 3074 0 65 1.814201
Albania 3106 68 83 2 1791 47 1232 2.672247
Equatorial Guinea 3071 0 51 0 842 0 2178 1.660697
Somalia 3028 13 92 0 1147 51 1789 3.038309
Nicaragua 2846 0 91 0 1993 0 762 3.197470
Paraguay 2554 52 20 0 1212 19 1322 0.783085
Maldives 2517 16 13 1 2180 22 324 0.516488
Cuba 2399 4 86 0 2242 2 71 3.584827
Mali 2358 10 120 1 1597 41 641 5.089059
Sri Lanka 2094 13 11 0 1967 12 116 0.525310
South Sudan 2021 0 38 0 333 0 1650 1.880257
Estonia 2003 8 69 0 1882 2 52 3.444833
Lebanon 1946 39 36 0 1368 20 542 1.849949
Zambia 1895 0 42 0 1348 0 505 2.216359
Iceland 1880 7 10 0 1850 3 20 0.531915
Malawi 1864 46 24 5 345 28 1495 1.287554
Lithuania 1854 10 79 0 1552 5 223 4.261057
Congo (Brazzaville) 1821 264 47 3 525 24 1249 2.580999
Slovakia 1798 31 28 0 1473 0 297 1.557286
Guinea-Bissau 1790 0 25 0 760 0 1005 1.396648
Slovenia 1763 24 111 0 1429 6 223 6.296086
Sierra Leone 1584 12 63 0 1122 34 399 3.977273
Cabo Verde 1542 43 18 0 730 6 794 1.167315
New Zealand 1540 3 22 0 1494 2 24 1.428571
Hong Kong 1323 24 7 0 1167 6 149 0.529101
Yemen 1318 21 351 3 595 4 372 26.631259
Libya 1268 86 36 1 306 11 926 2.839117
Tunisia 1221 16 50 0 1050 1 121 4.095004
Benin 1199 0 21 0 333 0 845 1.751460
Rwanda 1194 22 3 0 610 15 581 0.251256
Jordan 1169 0 10 0 977 8 182 0.855432
Latvia 1141 7 30 0 1008 0 103 2.629273
Eswatini 1138 82 14 0 588 18 536 1.230228
Niger 1097 3 68 0 976 2 53 6.198724
Mozambique 1071 31 8 0 337 57 726 0.746965
Cyprus 1008 3 19 0 839 0 150 1.884921
Burkina Faso 1003 0 53 0 861 1 89 5.284148
Uganda 977 6 0 0 904 8 73 0.000000
Uruguay 974 9 29 0 871 6 74 2.977413
Georgia 963 5 15 0 841 3 107 1.557632
Montenegro 960 53 17 0 320 0 623 1.770833
Liberia 926 9 41 0 395 1 490 4.427646
Zimbabwe 885 98 9 0 206 5 670 1.016949
Chad 873 0 74 0 788 0 11 8.476518
Andorra 855 0 52 0 802 2 1 6.081871
Jamaica 751 6 10 0 600 1 141 1.331558
Sao Tome and Principe 724 0 13 0 283 4 428 1.795580
Diamond Princess 712 0 13 0 651 0 48 1.825843
San Marino 698 0 42 0 656 0 0 6.017192
Togo 695 6 15 0 475 8 205 2.158273
Malta 673 0 9 0 654 1 10 1.337296
Suriname 665 31 17 2 434 29 214 2.556391
Namibia 593 54 0 0 25 0 568 0.000000
Tanzania 509 0 21 0 183 0 305 4.125737
Taiwan 449 0 7 0 438 0 4 1.559020
Angola 386 0 21 0 117 0 248 5.440415
Syria 372 0 14 0 126 0 232 3.763441
Vietnam 369 0 0 0 347 5 22 0.000000
Mauritius 342 0 10 0 330 0 2 2.923977
Burma 317 1 6 0 250 5 61 1.892744
Botswana 314 0 1 0 31 0 282 0.318471
Comoros 313 2 7 0 272 6 34 2.236422
Guyana 284 0 16 0 125 0 143 5.633803
Mongolia 227 0 0 0 197 2 30 0.000000
Eritrea 215 0 0 0 56 0 159 0.000000
Burundi 191 0 1 0 118 0 72 0.523560
Brunei 141 0 3 0 138 0 0 2.127660
Cambodia 141 0 0 0 131 0 10 0.000000
Trinidad and Tobago 133 0 8 0 117 0 8 6.015038
Monaco 108 0 4 0 95 0 9 3.703704
Bahamas 106 2 11 0 89 0 6 10.377358
Barbados 98 0 7 0 90 0 1 7.142857
Seychelles 91 10 0 0 11 0 80 0.000000
Lesotho 91 0 0 0 11 0 80 0.000000
Liechtenstein 84 0 1 0 81 0 2 1.190476
Bhutan 80 0 0 0 55 0 25 0.000000
Antigua and Barbuda 70 0 3 0 23 0 44 4.285714
Gambia 61 0 3 0 27 0 31 4.918033
Macau 46 0 0 0 45 0 1 0.000000
Belize 30 0 2 0 19 0 9 6.666667
Saint Vincent and the Grenadines 29 0 0 0 29 0 0 0.000000
Timor-Leste 24 0 0 0 24 0 0 0.000000
Grenada 23 0 0 0 23 0 0 0.000000
Saint Lucia 22 0 0 0 19 0 3 0.000000
Fiji 21 0 0 0 18 0 3 0.000000
Laos 19 0 0 0 19 0 0 0.000000
Dominica 18 0 0 0 18 0 0 0.000000
Saint Kitts and Nevis 16 0 0 0 15 0 1 0.000000
Holy See 12 0 0 0 12 0 0 0.000000
Papua New Guinea 11 0 0 0 8 0 3 0.000000
Western Sahara 10 0 1 0 8 0 1 10.000000
MS Zaandam 9 0 2 0 0 0 7 22.222222
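
The Mortality Rate column in the table above is the ratio of cumulative deaths to cumulative confirmed cases, expressed as a percentage. The minimal sketch below reproduces that arithmetic for three rows copied from the table. The frame name `sample` and the zero-guard are assumptions added for illustration. The figure is a crude ratio: it depends on how widely each country tests, and it counts active cases whose outcome is not yet known.

```python
import numpy as np
import pandas as pd

# Cumulative figures copied from three rows of the table above.
sample = pd.DataFrame(
    {"Confirmed": [3054699, 242149, 977], "Deaths": [132300, 34914, 0]},
    index=["United States", "Italy", "Uganda"],
)

# Crude ratio: cumulative deaths per 100 cumulative confirmed cases.
# Replacing 0 confirmed with NaN avoids a division by zero for empty rows.
confirmed = sample["Confirmed"].replace(0, np.nan)
sample["Mortality Rate"] = sample["Deaths"] / confirmed * 100

print(sample.round(6))
```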
world_cumulative = pd.DataFrame(case_data.groupby(['ObservationDate']).sum()).reset_index()

fig = go.Figure()
fig.add_trace(go.Scatter(x=world_cumulative['ObservationDate'],y=world_cumulative['Confirmed'],mode='lines',name='Confirmed',line=dict( width=4)))
fig.add_trace(go.Scatter(x=world_cumulative['ObservationDate'],y=world_cumulative['Deaths'],mode='lines',name='Deaths',line=dict( width=4)))
fig.add_trace(go.Scatter(x=world_cumulative['ObservationDate'],y=world_cumulative['Recovered'],mode='lines',name='Recovered',line=dict( width=4)))
fig.add_trace(go.Scatter(x=world_cumulative['ObservationDate'],y=world_cumulative['Active'],mode='lines',name='Active',line=dict( width=4)))
fig.update_layout(margin=dict(l=0,r=20,b=0,t=60,pad=0),
                  paper_bgcolor="white",height= 600,
                  legend=dict(x=.01,y=.98),
                  title_text = 'Number of COVID-19 cases worldwide',font_size=15,
                  xaxis_title="Date",
                  yaxis_title="Number of cases",)
fig.layout.hovermode = 'x'
fig.show()
world_daily_cumulative = pd.DataFrame(case_data.groupby(['ObservationDate']).sum())

fig = go.Figure()
fig.add_trace(go.Scatter(x=world_daily_cumulative.index,y=world_daily_cumulative['new_confirmed'],mode='lines',name='Confirmed',line=dict( width=4)))
fig.add_trace(go.Scatter(x=world_daily_cumulative.index,y=world_daily_cumulative['new_deaths'],mode='lines',name='Deaths',line=dict( width=4)))
fig.add_trace(go.Scatter(x=world_daily_cumulative.index,y=world_daily_cumulative['new_recovered'],mode='lines',name='Recovered',line=dict( width=4)))
fig.update_layout(margin=dict(l=0,r=20,b=0,t=60,pad=0),
                  paper_bgcolor="white",height= 600,
                  legend=dict(x=.01,y=.98),
                  title_text = 'Number of daily COVID-19 cases worldwide',font_size=15,
                  xaxis_title="Date",
                  yaxis_title="Number of new cases",)
fig.layout.hovermode = 'x'
fig.show()

World Map Visualization

def CreateMap(data,color,fill_color):
    _map = folium.Map(location=[10,5], tiles="Stamen Toner", zoom_start=2.3)
    for name,cases,deaths,lat,lon in data.values:
        folium.CircleMarker([lat,lon],radius=((int(np.log(cases + 1)))*1.5),color=color,fill_color=fill_color,
                            tooltip = "<h5 style='text-align:center;font-weight: bold'>"+ name +"</h5>"+
                            "<hr style='margin:10px;'>"+
                            "<ul style='color: #444;list-style-type:circle;align-item:left;padding-left:20px;padding-right:20px'>"+
                            "<li>Confirmed: "+str(cases)+"</li>"+
                            "<li>Deaths:   "+str(deaths)+"</li>"+
                            "</ul>",fill_opacity=0.7).add_to(_map)
    return _map

df_map_plot = pd.DataFrame()
for i in case_confirmed.index:
    if pd.isna(case_confirmed.loc[i,'Province/State']):
        df_map_plot.loc[i,'location'] = case_confirmed.loc[i,'Country/Region']
    else:
        df_map_plot.loc[i,'location'] = str(case_confirmed.loc[i,'Province/State']) + ", " + case_confirmed.loc[i,'Country/Region']
    if case_confirmed.loc[i,case_confirmed.columns[-1]] < 0 :
        df_map_plot.loc[i,'cases'] = 1
    else:
        df_map_plot.loc[i,'cases'] = case_confirmed.loc[i,case_confirmed.columns[-1]]
    # record deaths for every row so the integer cast below never meets a NaN
    df_map_plot.loc[i,'deaths'] = case_deaths.loc[i,case_deaths.columns[-1]]
    df_map_plot.loc[i,'Lat'] = str(case_confirmed.loc[i,'Lat'])
    df_map_plot.loc[i,'Long'] = str(case_confirmed.loc[i,'Long'])

df_map_plot['Lat'] = df_map_plot['Lat'].astype('float')
df_map_plot['Long'] = df_map_plot['Long'].astype('float')
df_map_plot['cases'] = df_map_plot['cases'].astype('int')
df_map_plot['deaths'] = df_map_plot['deaths'].astype('int')

CreateMap(df_map_plot,'#022474','#4D76D7')
temp_df = case_data.groupby(['Country/Region','ObservationDate']).sum().reset_index().sort_values(['ObservationDate'])
temp_df['ObservationDate'] = temp_df['ObservationDate'].astype('str')

fig = px.choropleth(temp_df, locations="Country/Region",
                    color=np.log10(temp_df["Confirmed"]),
                    hover_name="Country/Region",
                    hover_data=["Confirmed",'Deaths','Recovered'],
                    color_continuous_scale=px.colors.sequential.thermal_r,
                    locationmode="country names",
                    animation_frame='ObservationDate',color_continuous_midpoint = 3)
fig.update_layout(margin=dict(l=20,r=0,b=0,t=70,pad=0),
                  paper_bgcolor="white",
                  height= 700,
                  title_text = 'Cumulative confirmed COVID-19 cases by country (log10 scale)',font_size=18)
fig.show()

Race Map Visualization

case_confirmed_racemap = case_confirmed.groupby('Country/Region').sum(numeric_only=True).reset_index()
case_confirmed_racemap.loc[case_confirmed_racemap['Country/Region'] == 'US','Country/Region'] = 'United States'
case_confirmed_racemap.loc[case_confirmed_racemap['Country/Region'] == 'Korea, South','Country/Region'] = 'South Korea'
for col in case_confirmed_racemap.columns[3:]:
    case_confirmed_racemap.rename(columns = {col:str(pd.to_datetime(col))[0:10]},inplace = True)

flags = pd.read_csv('https://raw.githubusercontent.com/AbshkPskr/Portfolio/master/Country_Flags.csv')
flags = flags.drop('Images File Name',axis = 1).rename(columns={'Country':'Country/Region'})
confirmed_with_flag = case_confirmed_racemap.merge(flags,on='Country/Region',how='left')
confirmed_with_flag.to_csv('country.csv')
files.download('country.csv')
%%HTML
<div class="flourish-embed flourish-bar-chart-race" data-src="visualisation/2059674" data-url="https://flo.uri.sh/visualisation/2059674/embed"><script src="https://public.flourish.studio/resources/embed.js"></script></div>

Comparison between most affected countries

Countries with more than 80,000 cases

case_country = case_data.groupby(['Country/Region','ObservationDate']).sum().reset_index()
countries = case_country[case_country['Confirmed'] > 80000]['Country/Region'].unique()

def CreateComparisonPlot(attr,title,x_title,y_title, y_axis_type = None, exclude = 'United States'):
    fig = go.Figure()
    for i in countries:
        if i == exclude: continue
        if exclude is None:
            one_country = case_country[(case_country['Country/Region'] == i) & (case_country[attr] > 0)][['ObservationDate',attr]]
            if y_axis_type == 'log': one_country['ObservationDate'] = [i for i in range(1,len(one_country)+1)]
        else:
            one_country = case_country[(case_country['Country/Region'] == i)][['ObservationDate',attr]]

        fig.add_trace(go.Scatter(x=one_country['ObservationDate'],
                                y=one_country[attr],mode='lines',
                                name=i,line=dict( width=4)))
        
    # fig.add_annotation(text="First case of United states",
    #                    x='2020-01-23', y=1, arrowhead=0, showarrow=True)
    fig.update_layout(margin=dict(l=0,r=20,b=0,t=60,pad=0),
                      paper_bgcolor="white",height= 600,
                      legend=dict(x=.01,y=.98,font=dict(size =12)),
                      title_text = title,font_size=15,
                      xaxis_title=x_title,
                      yaxis_title=y_title)
    if y_axis_type == 'log' : 
        fig.update_yaxes(type="log")
        fig.update_layout(legend=dict(x=.85,y=.02,font=dict(size =12)))
        
    fig.layout.hovermode = 'x'
    fig.show()
CreateComparisonPlot('Confirmed','Number of confirmed cases for most affected countries (Excluding USA)','Date','Number of cases')
CreateComparisonPlot('Deaths','Number of Deaths in most affected countries (Excluding USA)','Date','Number of cases')
CreateComparisonPlot('Recovered','Number of Recovered cases in most affected countries (Excluding USA)','Date','Number of cases')
CreateComparisonPlot('new_confirmed','Number of Daily Confirmed cases in most affected countries','Date','Number of cases',None,None)
CreateComparisonPlot('new_deaths','Number of Daily Death cases in most affected countries','Date','Number of cases',None,None)
CreateComparisonPlot('Confirmed','Number of confirmed cases for most affected countries',
                     'Day','Number of cases (log scale)','log',None)
CreateComparisonPlot('Deaths','Number of deaths for most affected countries',
                     'Day','Number of deaths (log scale)','log',None)
CreateComparisonPlot('Recovered','Number of recovered cases for most affected countries',
                     'Day','Number of recovered cases (log scale)','log',None)

Trend of each country separately

case_country = case_data.groupby(['Country/Region','ObservationDate']).sum().reset_index()
case_country['ObservationDate'] = case_country['ObservationDate'].astype('str')

countries = case_country[case_country['Confirmed'] > 200].sort_values(['Confirmed'],ascending=False)['Country/Region'].unique()

# row = int((len(countries)+1)/3)
rows = len(countries)
columns = 2
f = plt.figure(figsize=(20,rows*5))
gs = f.add_gridspec(rows,columns)
sns.set(style = "whitegrid")

country = 0
for i in range(0,rows):
    data = case_country[case_country['Country/Region'] == countries[country]]
    data = data[data['Confirmed'] > 0]
    data['day'] = [i for i in range(1,len(data)+1)]

    for j in range(0,columns):
        #if country == len(countries): break
        ax = f.add_subplot(gs[i,j])

        if j == 0:
            sns.scatterplot(data = data,x = 'day',y = 'Confirmed',s = 50,color="#4348C4",edgecolor = 'none')
            sns.lineplot(data = data,x = 'day',y = 'Confirmed', color = '#4348C4')
            plt.fill_between(data['day'], data['Confirmed'], alpha=0.30, color = '#4348C4')

            sns.scatterplot(data = data,x = 'day',y = 'Recovered',s= 50,color="#5BAC4D",edgecolor = 'none')
            sns.lineplot(data = data,x = 'day',y = 'Recovered',color="#5BAC4D")
            plt.fill_between(data['day'], data['Recovered'],alpha=0.30,color="#5BAC4D")

            sns.scatterplot(data = data,x = 'day',y = 'Deaths',s= 50,color='#BB3535',edgecolor = 'none')
            sns.lineplot(data = data,x = 'day',y = 'Deaths',color='#BB3535')
            plt.fill_between(data['day'], data['Deaths'],alpha=0.30,color='#BB3535')

            ax.legend(['Confirmed','Recovered','Deaths'], fontsize=16,loc = 'upper left')
            ax.set_xlabel('Day',fontsize=20)
            ax.set_ylabel('No. of cases',fontsize=20)
            ax.set_title(countries[country], fontdict={'fontsize': 25, 'weight' : 'bold'}, color="black",loc = 'left')
            
            text = str(data['ObservationDate'].values[0]) + " - " + str(data['ObservationDate'].values[-1]) + "\n" + "\n" 
            text += 'Total Cases :-' + "\n" 
            text += 'Confirmed :' + str(data['Confirmed'].values[-1]) + "\n" 
            text += 'Deaths      :' + str(data['Deaths'].values[-1]) + "\n" 
            text += 'Recovered :' + str(data['Recovered'].values[-1]) + "\n" + "\n" 

            ax.text(0.02, 0.15, text, fontsize=15, transform=ax.transAxes,bbox=dict(facecolor='#F8F8F8', alpha=.7))

        if j == 1:
            sns.scatterplot(data = data,x = 'day',y = 'new_confirmed',s = 50,color="#4348C4",edgecolor = 'none')
            sns.lineplot(data = data,x = 'day',y = 'new_confirmed', color = '#4348C4')
            plt.fill_between(data['day'], data['new_confirmed'], alpha=0.30, color = '#4348C4')

            sns.scatterplot(data = data,x = 'day',y = 'new_recovered',s= 50,color="#5BAC4D",edgecolor = 'none')
            sns.lineplot(data = data,x = 'day',y = 'new_recovered',color="#5BAC4D")
            plt.fill_between(data['day'], data['new_recovered'],alpha=0.30,color="#5BAC4D")

            sns.scatterplot(data = data,x = 'day',y = 'new_deaths',s= 50,color='#BB3535',edgecolor = 'none')
            sns.lineplot(data = data,x = 'day',y = 'new_deaths',color='#BB3535')
            plt.fill_between(data['day'], data['new_deaths'],alpha=0.30,color='#BB3535')
            
            ax.legend(['Daily Confirmed','Daily Recovered','Daily Deaths'], fontsize=16,loc = 'upper left')
            ax.set_xlabel('Day',fontsize=20)
            ax.set_ylabel('No. of cases',fontsize=20)
            
            text = 'In last 24 hrs :-' + "\n" 
            text += 'Confirmed :' + str(data['new_confirmed'].values[-1]) + "\n" 
            text += 'Deaths      :' + str(data['new_deaths'].values[-1]) + "\n" 
            text += 'Recovered :' + str(data['new_recovered'].values[-1]) 

            ax.text(0.02, 0.3, text, fontsize=15, transform=ax.transAxes,bbox=dict(facecolor='#F8F8F8', alpha=.7))

    country += 1

f.tight_layout()

Age Analysis

covid_data = pd.read_csv("COVID19_line_list_data.csv")
covid_data['death'] = [0 if i == '0' else 1 for i in covid_data.death]
for i in covid_data.index:
    age = covid_data.loc[i,'age']
    try:
        age = int(age)
    except (TypeError, ValueError):
        # missing or non-numeric age: leave the row unbinned
        continue
    if age <= 10: covid_data.loc[covid_data.index == i,'age'] = '1-10'
    if 11 <= age <= 20: covid_data.loc[covid_data.index == i,'age'] = '11-20'
    if 21 <= age <= 30: covid_data.loc[covid_data.index == i,'age'] = '21-30'
    if 31 <= age <= 40: covid_data.loc[covid_data.index == i,'age'] = '31-40'
    if 41 <= age <= 50: covid_data.loc[covid_data.index == i,'age'] = '41-50'
    if 51 <= age <= 60: covid_data.loc[covid_data.index == i,'age'] = '51-60'
    if 61 <= age <= 70: covid_data.loc[covid_data.index == i,'age'] = '61-70'
    if age > 70: covid_data.loc[covid_data.index == i,'age'] = '70+'
a = covid_data.groupby('age')['id'].count().reset_index()
fig = go.Figure().add_trace(go.Pie(values=a.id,labels= a.age,hole = .5))
fig.update_layout(margin=dict(l=0,r=20,b=20,t=60,pad=0),
                  paper_bgcolor="white",height= 600,
                  legend=dict(x=.825,y=.98),
                  title_text = 'Age distribution of cases in the line list',font_size=15)
fig.layout.hovermode = 'x'
fig.show()
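
The age bucketing above can also be expressed without a row-by-row loop. The following is a minimal sketch, assuming the same bucket edges and labels as the loop, that uses `pd.to_numeric` and `pd.cut`. The sample `ages` series is hypothetical. Ages are floored first so that fractional values land in the same bucket as they would with `int()`.

```python
import numpy as np
import pandas as pd

# Hypothetical ages, including a missing value and a non-numeric entry.
ages = pd.Series([0.5, 10, 11, 35, 70, 71, np.nan, "unknown"])

# Coerce to numbers; anything unparseable becomes NaN and stays unbinned.
numeric_age = np.floor(pd.to_numeric(ages, errors="coerce"))

# Right-closed bins: (-inf, 10], (10, 20], ..., (60, 70], (70, inf)
bins = [-np.inf, 10, 20, 30, 40, 50, 60, 70, np.inf]
labels = ["1-10", "11-20", "21-30", "31-40", "41-50", "51-60", "61-70", "70+"]
age_group = pd.cut(numeric_age, bins=bins, labels=labels)

print(age_group.value_counts(sort=False))
```

Because the result is a categorical column, a later `groupby` or `value_counts` keeps the buckets in their natural order rather than sorting the labels alphabetically.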

Testing Analysis

tests = pd.read_csv('https://raw.githubusercontent.com/owid/covid-19-data/master/public/data/testing/covid-testing-all-observations.csv')
tests.drop(tests[tests["Entity"] == "India - people tested"].index,inplace=True)
tests.drop(tests[tests["Entity"] == "United Kingdom - people tested"].index,inplace=True)
tests.drop(tests[tests["Entity"] == "Singapore - people tested"].index,inplace=True)
tests.drop(tests[tests["Entity"] == "United States - specimens tested (CDC)"].index,inplace=True)
tests['Country'] = [i.split(" - ")[0] for i in tests['Entity'].values]
tests['Cumulative total per million'] = tests['Cumulative total per thousand']*1000
tests_data = tests.groupby('Country')[['Cumulative total','Cumulative total per million']].max().reset_index()
con = pd.DataFrame(case_confirmed.groupby('Country/Region')[case_confirmed.columns[-1]].sum()).reset_index()
con = con.rename(columns = {'Country/Region':'Country',case_confirmed.columns[-1]:'Confirmed'})
con.loc[(con['Country'] == 'Korea, South'),'Country'] = 'South Korea'
con.loc[(con['Country'] == 'US'),'Country'] = 'United States'
con.loc[(con['Country'] == 'Taiwan*'),'Country'] = 'Taiwan'
tests_data = tests_data.merge(con,how = 'left', on = 'Country')
det = pd.DataFrame(case_deaths.groupby('Country/Region')[case_deaths.columns[-1]].sum()).reset_index()
det = det.rename(columns = {'Country/Region':'Country',case_deaths.columns[-1]:'Deaths'})
det.loc[(det['Country'] == 'Korea, South'),'Country'] = 'South Korea'
det.loc[(det['Country'] == 'US'),'Country'] = 'United States'
det.loc[(det['Country'] == 'Taiwan*'),'Country'] = 'Taiwan'
tests_data = tests_data.merge(det,how = 'left', on = 'Country')
tests_data['Mortality Rate'] = (tests_data['Deaths']/tests_data['Confirmed'])*100
tests_data['Spread Rate'] = (tests_data['Confirmed']/tests_data['Cumulative total'])*100
tests_data = tests_data.sort_values('Cumulative total',ascending=False).reset_index(drop=True).dropna()
tests_data['Cumulative total per million'] = tests_data['Cumulative total per million'].astype('int')
tests_data['Confirmed'] = tests_data['Confirmed'].astype('int')
tests_data['Deaths'] = tests_data['Deaths'].astype('int')
tests_data.head(40).style.background_gradient(cmap='BuGn',subset=['Cumulative total'])\
.background_gradient(cmap='bone_r',subset=['Cumulative total per million'])\
.background_gradient(cmap='Blues',subset=['Confirmed'])\
.background_gradient(cmap='Reds',subset=['Deaths'])\
.background_gradient(cmap='YlOrBr',subset=['Mortality Rate'])\
.background_gradient(cmap='Purples',subset=['Spread Rate'])
    Country         Cumulative total  Cumulative total per million  Confirmed  Deaths  Mortality Rate  Spread Rate
 0  United States    39011749.000000                        117859    3054699  132300        4.331032     7.830203
 1  Russia           22079294.000000                        151296     699749   10650        1.521974     3.169254
 2  India            10740832.000000                          7782     767296   21129        2.753696     7.143730
 3  United Kingdom    6717480.000000                         98952     288511   44602       15.459376     4.294929
 4  Germany           6376054.000000                         76101     198699    9046        4.552615     3.116332
 5  Italy             5806668.000000                         96039     242149   34914       14.418395     4.170188
 6  Turkey            3782520.000000                         44849     208938    5282        2.528023     5.523778
 7  Spain             3644458.000000                         77948     252513   28396       11.245362     6.928685
 8  Canada            3055265.000000                         80951     108334    8786        8.110104     3.545814
 9  Australia         2910831.000000                        114151       9056     106        1.170495     0.311114
10  Saudi Arabia      2062535.000000                         59245     220144    2059        0.935297    10.673467
11  South Africa      1944399.000000                         32784     224665    3602        1.603276    11.554470
12  Iran              1897803.000000                         22595     248379   12084        4.865146    13.087712
13  Poland            1696572.000000                         44828      36689    1542        4.202895     2.162537
14  Kazakhstan        1620252.000000                         86291      51059     264        0.517049     3.151300
15  Pakistan          1491437.000000                          6752     240848    4983        2.068940    16.148721
16  Brazil            1478671.000000                          6957    1713160   67964        3.967172   115.858091
17  South Korea       1347759.000000                         26288      13293     287        2.159031     0.986304
18  Portugal          1299594.000000                        127452      44859    1631        3.635837     3.451770
19  Chile             1220790.000000                         63861     303083    6573        2.168713    24.826792
20  Denmark           1169208.000000                        201859      13101     609        4.648500     1.120502
21  Belarus           1074240.000000                        113684      64224     443        0.689773     5.978552
22  Israel            1047501.000000                        121021      33557     344        1.025121     3.203529
23  Belgium           1034005.000000                         89217      62123    9776       15.736523     6.007998
24  Bangladesh         907784.000000                          5512     172134    2197        1.276331    18.962000
25  Colombia           902305.000000                         17733     124494    4606        3.699777    13.797330
26  Singapore          866414.000000                        148096      45298      26        0.057398     5.228217
27  France             831174.000000                         12734     206072   29936       14.526961    24.792883
28  Malaysia           822646.000000                         25416       8677     121        1.394491     1.054767
29  Philippines        822098.000000                          7502      50359    1314        2.609265     6.125669
30  Morocco            819124.000000                         22191      14771     242        1.638345     1.803268
31  Romania            809663.000000                         42086      30175    1817        6.021541     3.726859
32  Ukraine            760247.000000                         17384      51457    1323        2.571079     6.768458
33  Japan              734568.000000                          5808      20261     982        4.846750     2.758220
34  Netherlands        681931.000000                         39798      50959    6154       12.076375     7.472750
35  Austria            675727.000000                         75027      18513     706        3.813536     2.739716
36  Switzerland        661362.000000                         76417      32498    1966        6.049603     4.913799
37  Bahrain            630753.000000                        370686      30931      98        0.316834     4.903821
38  Thailand           603657.000000                          8648       3197      58        1.814201     0.529605
39  Sweden             600019.000000                         59412      73858    5482        7.422351    12.309277
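
The two derived columns in the table can be checked by hand. The following is a minimal sketch in plain Python using three rows copied from the table above; the helper names `mortality_rate` and `spread_rate` are illustrative and are not part of the notebook. As computed here, 'Spread Rate' is the number of confirmed cases per 100 tests, so a value above 100 (as for Brazil) suggests that the testing and case counts were not recorded on a comparable basis, rather than reflecting a real rate.

```python
# Rows copied from the table: (country, cumulative tests, confirmed, deaths)
rows = [
    ("United States", 39011749, 3054699, 132300),
    ("Brazil", 1478671, 1713160, 67964),
    ("South Korea", 1347759, 13293, 287),
]

def mortality_rate(deaths, confirmed):
    """Deaths per 100 confirmed cases."""
    return deaths / confirmed * 100

def spread_rate(confirmed, tests):
    """Confirmed cases per 100 tests performed."""
    return confirmed / tests * 100

# Countries whose confirmed count exceeds the number of tests recorded
suspect = [country for country, tests, confirmed, _ in rows
           if spread_rate(confirmed, tests) > 100]

for country, tests, confirmed, deaths in rows:
    print(f"{country}: mortality {mortality_rate(deaths, confirmed):.2f}%, "
          f"spread {spread_rate(confirmed, tests):.2f}%")
print("Check the data for:", suspect)
```

Rows flagged this way are worth inspecting before they are ranked in the charts that follow.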
tests = tests_data.head(40)

fig = go.Figure()
fig.add_trace(go.Bar(x=tests['Country'],y=tests['Cumulative total'],
                     text=tests['Confirmed'],name='Total Tests',marker= { 'color': 'rgb(47,138,0)'}))
fig.update_layout(margin=dict(l=0,r=20,b=0,t=60,pad=0),
                  paper_bgcolor="white",height= 600,
                  legend=dict(x=.01,y=.98),
                  title_text = 'Number of tests done by countries',font_size=15,
                  xaxis_title="Country",
                  yaxis_title="Number of tests",)
fig.layout.hovermode = 'x'
fig.show()
tests_per_million = tests_data.sort_values('Cumulative total per million',ascending = False).head(40)

fig = go.Figure()
fig.add_trace(go.Bar(x=tests_per_million['Country'],y=tests_per_million['Cumulative total per million'],
                     name='Tests per million',marker= { 'color': 'rgb(105,67,144)'}))
fig.update_layout(margin=dict(l=0,r=20,b=0,t=60,pad=0),
                  paper_bgcolor="white",height= 600,
                  legend=dict(x=.01,y=.98),
                  title_text = 'Number of tests per million of population',font_size=15,
                  xaxis_title="Country",
                  yaxis_title="Number of tests per million",)
fig.layout.hovermode = 'x'
fig.show()
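
As a rough sanity check on the per-million figures, dividing a country's cumulative total by its per-million value should give back something close to its population in millions. The sketch below uses values copied from the table earlier in this section. `implied_population_millions` is a hypothetical helper, not part of the notebook, and because the per-million column was truncated to an integer the result is only approximate.

```python
def per_thousand_to_per_million(per_thousand):
    # Same conversion the notebook applies to the whole column
    return per_thousand * 1000

def implied_population_millions(total_tests, tests_per_million):
    # total tests / (tests per million people) = population in millions
    return total_tests / tests_per_million

# (country, cumulative tests, tests per million) copied from the table
samples = [("United States", 39011749, 117859), ("India", 10740832, 7782)]

for country, total, per_million in samples:
    population = implied_population_millions(total, per_million)
    print(f"{country}: roughly {population:.0f} million people implied")
```

An implied population far from the known figure would point to a mismatch between the total and per-capita series for that country.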
tests_comparison = tests_data.head(40)

fig = go.Figure()
fig.add_trace(go.Bar(x=tests_comparison['Country'],y=tests_comparison['Cumulative total per million'],
                     name = 'Tests per million',marker= { 'color': 'rgb(105,67,144)'}))
fig.add_trace(go.Bar(x=tests_comparison['Country'],y=tests_comparison['Cumulative total'],
                     name = 'Total Tests',marker= { 'color': 'rgb(47,138,0)'}))
fig.update_layout(margin=dict(l=0,r=20,b=0,t=60,pad=0),
                  paper_bgcolor="white",height= 600,
                  legend=dict(x=.825,y=.98),
                  title_text = 'Total tests vs. tests per million by country',font_size=15,
                  xaxis_title="Country",
                  yaxis_title="Number of tests (log scale)",
                  yaxis_type="log",
                  barmode='group')  # side-by-side bars; stacked heights are hard to read on a log axis
fig.layout.hovermode = 'x'
fig.show()

Mortality and Spread Rates Comparison

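# Sum only the numeric columns needed here, so non-numeric columns are not aggregated
global_case_data = case_data.groupby('ObservationDate')[['Confirmed','Deaths']].sum().reset_index()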
global_case_data['Mortality Rate'] = (global_case_data['Deaths']/global_case_data['Confirmed'])*100

fig = go.Figure()
fig.add_trace(go.Scatter(x=global_case_data['ObservationDate'],y=global_case_data['Mortality Rate'],mode='lines+markers',
                         fill='tozeroy',name = 'Mortality Rate',marker= { 'color': '#B87625','size' : 10},line=dict( width=4)))
fig.update_layout(margin=dict(l=0,r=20,b=0,t=60,pad=0),
                  paper_bgcolor="white",height= 600,
                  legend=dict(x=.825,y=.98),
                  title_text = 'Global Mortality Rate',font_size=15,
                  xaxis_title="Date",
                  yaxis_title="Mortality rate (%)")
fig.layout.hovermode = 'x'
fig.show()
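
The global rate plotted above is the ratio of summed deaths to summed confirmed cases on each date, not the average of the per-region rates, and the two can differ considerably. A small sketch in plain Python illustrates the difference. The records below are made up for illustration and are not taken from `case_data`.

```python
from collections import defaultdict

# Hypothetical per-region records: (date, region, confirmed, deaths)
records = [
    ("2020-07-01", "A", 1000, 10),
    ("2020-07-01", "B", 100, 10),
    ("2020-07-02", "A", 1200, 12),
    ("2020-07-02", "B", 150, 12),
]

# Sum confirmed and deaths per date, mirroring groupby('ObservationDate').sum()
totals = defaultdict(lambda: [0, 0])
for date, _region, confirmed, deaths in records:
    totals[date][0] += confirmed
    totals[date][1] += deaths

# Ratio of sums: the quantity plotted as the global mortality rate
global_rate = {date: deaths / confirmed * 100
               for date, (confirmed, deaths) in totals.items()}

# Mean of the per-region rates on the first date, for comparison only
first_day = [d / c * 100 for date, _r, c, d in records if date == "2020-07-01"]
mean_of_rates = sum(first_day) / len(first_day)

print(global_rate["2020-07-01"], mean_of_rates)
```

The ratio of sums weights each region by its case count, so large regions dominate the global figure.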
tests_mortality = tests_data.sort_values('Mortality Rate',ascending = False).head(40)

fig = go.Figure()
fig.add_trace(go.Bar(x=tests_mortality['Country'],y=tests_mortality['Mortality Rate'],
                     name = 'Mortality Rate',marker= { 'color': '#D38B2C'}))
fig.update_layout(margin=dict(l=0,r=20,b=0,t=60,pad=0),
                  paper_bgcolor="white",height= 600,
                  legend=dict(x=.825,y=.98),
                  title_text = 'Mortality Rate (No. of Deaths / No. of Confirmed cases)',font_size=15,
                  xaxis_title="Country",
                  yaxis_title="Mortality rate (%)")
fig.layout.hovermode = 'x'
fig.show()
tests_spread = tests_data.sort_values('Spread Rate',ascending = False).head(40)

fig = go.Figure()
fig.add_trace(go.Bar(x=tests_spread['Country'],y=tests_spread['Spread Rate'],
                     name = 'Spread Rate',marker= { 'color': '#B325B8'}))
fig.update_layout(margin=dict(l=0,r=20,b=0,t=60,pad=0),
                  paper_bgcolor="white",height= 600,
                  legend=dict(x=.825,y=.98),
                  title_text = 'Spread Rate (No. of Confirmed cases / No. of Tests)',font_size=15,
                  xaxis_title="Country",
                  yaxis_title="Spread rate (%)")
fig.layout.hovermode = 'x'
fig.show()