The Problem;
Assuming you have a variable that you collected unique
IDs
likeHHIDs
and would like to see if there are any duplicates;
#The required libraries
require(tidyverse)
#Load the libraries
library(tidyverse)
hhIDs_Dups <-
DF %>%
group_by(hhID) %>% # the group of interest (hhIDs)
mutate(duplicate = n()) %>% # count number in each group
filter(duplicate > 1) %>% # select only duplicated records
select(-duplicate) # remove group count column
If you now View(hhIDs_Dups)
, you will only see the duplicate records.
Its advisable to add some metadata to your DF
like location and other respondent info for easier identification of which record to update/change.