An Exploration of COVID-19

Welcome to my final project for the data science bootcamp!

For my project, I’m using COVID-19 case and death data from the CDC. Here’s the tibble I’ll be working with:

# A tibble: 53,520 × 15
   submission_date state tot_cases conf_cases prob_cases new_case
   <chr>           <chr>     <dbl>      <dbl>      <dbl>    <dbl>
 1 12/22/2021      DE       165076     151750      13326      662
 2 03/18/2021      NE       206980         NA         NA      298
 3 09/01/2021      ND       118491     107475      11016      536
 4 03/28/2022      VT       107785         NA         NA      467
 5 03/11/2021      MD       390490         NA         NA      924
 6 04/21/2022      ID       445350     348949      96401        0
 7 02/02/2021      IL      1130917    1130917          0     2304
 8 12/13/2020      MD       234647         NA         NA     2638
 9 06/15/2020      WI        25480      22932       2548      185
10 03/10/2020      CA          157        157          0       24
# … with 53,510 more rows, and 9 more variables: pnew_case <dbl>,
#   tot_death <dbl>, conf_death <dbl>, prob_death <dbl>,
#   new_death <dbl>, pnew_death <dbl>, created_at <chr>,
#   consent_cases <chr>, consent_deaths <chr>

This table gives us 53,520 rows of COVID-19 data, one per state and submission date. For example, I can look up cumulative totals of cases and deaths (tot_cases, tot_death), as well as daily counts (new_case, new_death).
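
To make that concrete, here is a minimal sketch of pulling both kinds of numbers out of the table. It assumes the dplyr and tibble packages are installed. The name covid_sample is made up for this example, and it only holds a handful of rows copied from the printout above, not the full data set. Since submission_date is stored as text, the sketch converts it to a real date before sorting.

```r
library(dplyr)
library(tibble)

# Hypothetical sample: a few rows copied from the printout above.
# The real tibble has 53,520 rows and 15 columns.
covid_sample <- tribble(
  ~submission_date, ~state, ~tot_cases, ~new_case,
  "12/22/2021",     "DE",   165076,     662,
  "09/01/2021",     "ND",   118491,     536,
  "12/13/2020",     "MD",   234647,     2638,
  "03/11/2021",     "MD",   390490,     924,
  "03/10/2020",     "CA",   157,        24
)

# submission_date is a character column (<chr>), so parse it into a Date.
covid_sample <- covid_sample %>%
  mutate(submission_date = as.Date(submission_date, format = "%m/%d/%Y"))

# Cumulative view: the most recent total case count for each state.
latest_totals <- covid_sample %>%
  group_by(state) %>%
  slice_max(submission_date, n = 1) %>%
  ungroup() %>%
  select(state, submission_date, tot_cases)

# Daily view: new cases reported in Maryland, in date order.
md_daily <- covid_sample %>%
  filter(state == "MD") %>%
  arrange(submission_date) %>%
  select(submission_date, new_case)

print(latest_totals)
print(md_daily)
```

The same pipelines should work on the full tibble by swapping covid_sample for whatever name the CDC data is loaded under.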