A little trick in reading data

· Data manipulation, GNU-R

The Census 1991 Primary Census Abstract files identify each village with a single variable, an 18-digit string that concatenates the codes for district, tehsil, block, panchayat and village. The village and town directories, however, identify each village with three separate variables that hold the district, block and village codes. To match the two sets of data, the 18-digit string had to be split into five different variables. A little piece of code did it, in the following steps.

#take out the variable code from distvc into a separate data frame called code
#write this data frame into a text file

#read this text file using read.fwf, reading five different variables

#assign names to these variables

#combine the new data frame code2 with the old data frame distvc

#delete the temporary objects
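
The steps listed above can be sketched roughly as follows. This is a minimal illustration under stated assumptions rather than the original code: the toy `distvc` data frame, its `population` column, the field widths `c(2, 4, 4, 4, 4)` and the five new column names are all made up for the example. The real widths have to be taken from the census documentation.

```r
# Toy stand-in for the Primary Census Abstract data (ASSUMPTION: the real
# distvc has many more rows and columns; only the variable `code` matters here).
distvc <- data.frame(
  code = c("010002000300040005", "020011002200330044"),
  population = c(1200, 850),
  stringsAsFactors = FALSE
)

# ASSUMPTION: illustrative field widths for district, tehsil, block,
# panchayat and village. They must sum to 18; check the census documentation.
widths <- c(2, 4, 4, 4, 4)

# take out the variable code from distvc into a separate data frame called code
code <- data.frame(code = distvc$code, stringsAsFactors = FALSE)

# write this data frame into a text file (no header, no row names, no quotes)
tmp <- tempfile(fileext = ".txt")
write.table(code, tmp, quote = FALSE, row.names = FALSE, col.names = FALSE)

# read this text file using read.fwf, reading five different variables;
# colClasses = "character" keeps leading zeros in each code
code2 <- read.fwf(tmp, widths = widths, colClasses = "character")

# assign names to these variables (hypothetical names)
names(code2) <- c("district", "tehsil", "block", "panchayat", "village")

# combine the new data frame code2 with the old data frame distvc
distvc <- cbind(distvc, code2)

# delete the temporary file and objects
unlink(tmp)
rm(code, code2, tmp, widths)
```

Note that `code` has to be stored as a character string in `distvc`: an 18-digit number cannot be held exactly as a double, so a numeric code would already be corrupted before it is split. Base R's `substring()` could produce the same five pieces without the temporary file, if the round trip through disk is unwanted.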
