In the first post (found here) we downloaded the data and imported it into R
using the gdata package. In this post we will change the column names to make
them more reasonable and add a quarter variable. The column names need
changing because those in the dw.2010.q1 file are messed up by the formatting
done in Excel. If I was going to have to change one file, I might as well
change them all, so I did.
The first chunk of code defines the labels I am going to use as c.label. Then
I use the colnames() function to rename the columns of each file.
#Defining the new labels
c.label<-c('loan.date', 'mat.date', 'term', 'repay.date', 'district',
           'borrower', 'city', 'state', 'ABA', 'type.credit', 'i.rate',
           'amount', 'outstanding.credit', 'total.outstanding', 'collateral',
           'commercial', 'residential.morg', 'comm.real', 'consumer',
           'treasury', 'municipal', 'corp', 'mbs.cmo', 'mbs.cmo.other',
           'asset.backed', 'internat', 'tdfd')

#Changing the column names
colnames(dw.2010.q3)<-c.label
colnames(dw.2010.q4)<-c.label
colnames(dw.2011.q1)<-c.label
I also like to add a few extra variables when I see a potential need for them. At this point the files are still separate, so adding a quarter variable might be helpful. Sure, I could write a loop to create the new column based on the month of the date, but I like to keep things as simple as possible. Why add complexity when there is no reason to? I used the ABA column to define the length of the data set because it did not have any missing values, while other columns did. The new column is named qtr, and the rep() function repeats the quarter number once for each entry in the ABA column.
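Here is a minimal sketch of that rep() step. The small data frame is only a stand-in I made up so the snippet runs on its own (the real dw.2010.q3 has all 27 columns), and the ABA and amount values are placeholders, not real data.

```r
# Toy stand-in for one quarterly file; values are made up for illustration.
dw.2010.q3 <- data.frame(
  ABA    = c(1, 2, 3),
  amount = c(10, 25, 5)
)

# Repeat the quarter number once per row, using ABA to set the length.
dw.2010.q3$qtr <- rep(3, length(dw.2010.q3$ABA))

# nrow(dw.2010.q3) would give the same count as length(dw.2010.q3$ABA).
```

The same line would be repeated for each of the other files, with the quarter number changed to match the file.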