Category archives for Data manipulation

Dropping columns in subset command

Use "select=c(var1,var2)" in the subset command to keep only var1 and var2. Use "select=-c(var1,var2)" in the subset command to drop var1 and var2.
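As a minimal sketch of both forms, assuming a small made-up data frame called df with columns id, var1, var2 and var3 (these names are for illustration only and do not come from the original post):

```r
# Hypothetical data frame, for illustration only
df <- data.frame(
  id   = 1:3,
  var1 = c(10, 20, 30),
  var2 = c("a", "b", "c"),
  var3 = c(TRUE, FALSE, TRUE)
)

# Keep only var1 and var2
kept <- subset(df, select = c(var1, var2))

# Drop var1 and var2, keeping every other column
dropped <- subset(df, select = -c(var1, var2))

names(kept)
names(dropped)
```

The minus sign works here because subset evaluates the select expression against the column positions, so negating the vector excludes those positions.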

Renaming variables in a dataframe

Base R has no dedicated command for renaming variables, which can make the task less than obvious for some people. Of course, once you know how, it is simple. The following command does the trick. names(dataframe)[names(dataframe)=="oldvariablename"]<-"newvariablename"   VR
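A short sketch of the same idiom on a made-up data frame (the data frame df and the column named other are assumptions added for illustration):

```r
# Hypothetical data frame with one column to rename
df <- data.frame(oldvariablename = 1:3, other = c("x", "y", "z"))

# The logical index picks out the matching name; only that element is replaced
names(df)[names(df) == "oldvariablename"] <- "newvariablename"

names(df)
```

If no column matches the old name, the logical index selects nothing and the names are left unchanged, so a typo fails silently rather than raising an error.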

Use reshape to tabulate

Package reshape is meant for aggregating, reshaping and tabulating data. Tabulation is done in two steps: melt and cast. Read the help for these functions. melt(sl1,id.var=,measure.var="foo")->sl2 This will create a dataframe sl2 which will have all the variables in sl1, with "foo" reorganised for casting later. See head(sl2) to see the form it takes. The following command […]
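The two steps might look roughly like this. It is only a sketch: it assumes the reshape package is installed, and the data frame sales, its columns region and year, and the choice of sum as the aggregating function are all made up for illustration rather than taken from the original post.

```r
library(reshape)  # assumes the reshape package is installed

# Hypothetical data: one measured variable, foo, and two identifying variables
sales <- data.frame(
  region = c("north", "north", "south", "south"),
  year   = c(2001, 2002, 2001, 2002),
  foo    = c(10, 20, 30, 40)
)

# Step 1: melt into long form (id columns plus "variable" and "value")
molten <- melt(sales, id.vars = c("region", "year"), measure.vars = "foo")
head(molten)

# Step 2: cast into a table of regions by years, summing foo in each cell
tab <- cast(molten, region ~ year, sum)
tab
```

The formula passed to cast decides the layout: variables on the left of the tilde become rows and those on the right become columns.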

Replacing selected values of variables

You can replace selected values of variables in a dataframe (or any other R object) by using the function replace. The documentation for the function is straightforward and comprehensive.
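For instance, a minimal sketch that recodes a placeholder value as missing. The data frame survey and the use of -99 as a missing-value code are assumptions made up for this example:

```r
# Hypothetical survey data where -99 stands for "not recorded"
survey <- data.frame(id = 1:5, age = c(34, -99, 51, -99, 27))

# replace(x, list, values): 'list' is an index into x, here a logical vector
survey$age <- replace(survey$age, survey$age == -99, NA)

survey
```

replace returns a modified copy and does not change its argument in place, which is why the result is assigned back to survey$age.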

More on “A Little Trick in Reading Data”

Here is an example of the beauty of R. To split up a character string, all you need is a function called substr (for substring). So you don’t really need to write the variable into a new file and read it back with read.fwf as I did (see my earlier post titled […]
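A rough sketch of the idea follows. The two 18-character codes and the field positions used here (characters 1-2 for district, 3-4 for tehsil) are invented for illustration; the real layout of the census code should be taken from the census documentation.

```r
# Hypothetical 18-digit codes, stored as character strings.
# Keep them as text: 18 digits exceed the precision of R's numeric type.
code <- c("010200030004000567", "020100050006000123")

# substr(x, start, stop) extracts the characters from start to stop, inclusive
district  <- substr(code, 1, 2)    # assumed positions
tehsil    <- substr(code, 3, 4)    # assumed positions
remainder <- substr(code, 5, 18)   # everything after the assumed tehsil field

data.frame(code, district, tehsil, remainder)
```

Because substr is vectorised, the same call splits a whole column of codes at once.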

Subset

The following text from the R help pages clearly explains the use of the subset command for extracting data from a data.frame. This command is very useful when preparing a data set for further analysis. Vikas ————— subset {base} R Documentation Subsetting Vectors and Data Frames Description Return subsets of vectors or data frames […]
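To make the description concrete, here is a small sketch of subset applied to a vector and to a data frame. The vector x and the data frame villages are made up for illustration:

```r
# On a vector: keep the elements that meet the condition.
# Elements for which the condition is NA are dropped.
x <- c(5, 12, NA, 20)
big <- subset(x, x > 10)

# On a data frame: the condition selects rows, select chooses columns
villages <- data.frame(
  name       = c("A", "B", "C", "D"),
  district   = c("X", "Y", "X", "Y"),
  population = c(1200, 450, 3100, 800)
)
large_x <- subset(villages,
                  district == "X" & population > 1000,
                  select = c(name, population))

big
large_x
```

Inside subset, column names can be written without the data frame prefix, which keeps the conditions short when preparing data interactively.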

A little trick in reading data

Census 1991 Primary Census Abstract files use a variable which is an 18-digit string. The string is a code that comprises codes for district, tehsil, block, panchayat and village. The village and town directories, however, give data for each village where villages are identified by three variables that capture the same codes for district, […]