This series of analysis notebooks was created to deliver an R session for Data Science students. The primary objective is to showcase a few important features of the language: managing projects, processing datasets with the tidyverse set of packages, developing visualisations with ggplot2, and performing GIS analysis through interactive maps built with leaflet.
Another important goal of this course is to teach the basics of reproducible research.
To explain the features highlighted above, we will use data from the 2019 Indian general elections. A few rich datasets for analysing constituencies, electors and candidates are available in the public domain. The data is collected by the Election Commission of India and published on its website after the elections.
Using these datasets, we’ll explore data points such as:
The above set of pages was generated by rendering the R Markdown documents. You can refer to this link to follow the step-by-step process for preparing the datasets for analysis.
The constituency dataset can be explored interactively here. Use these links to download the processed datasets:
Datasets from the Election Commission of India - Link
“TCPD Indian Elections Data v2.0”, Trivedi Centre for Political Data, Ashoka University. Ananay Agarwal, Neelesh Agrawal, Saloni Bhogale, Sudheendra Hangal, Francesca Refsum Jensenius, Mohit Kumar, Chinmay Narayan, Basim U Nissa, Priyamvada Trivedi, and Gilles Verniers. 2021. - Link
Parliamentary Constituencies (PC) shape files from DataMeet - Link
Books
This session was curated by Apoorv Anand for the Online Workshop Series on Learning R.