OutLie..R: Fed Loan Data Part 1

Tuesday, December 27, 2011

This is the start of analyzing the Federal Reserve and banking data mentioned in my post "A Christmas Miracle". The file is a combination of summary data and actual data from each of the 400+ banks that received funds from the Federal Reserve during 2007-2009.

Like so many data sets, this one has some cleanup challenges. The first is the use of Excel, which is not a problem in itself, but the authors added extensive headers and unusual formats to the summary tables. Here is my attempt to get one of the first data files cleaned up and working. One big problem is changing $1,000.00 into 1000.00. I have some code below, but would appreciate any help in making it better. Below are the summary graphs of the data:

The basic graphs are pretty self-explanatory. For fun I ran a regression of average daily balance on days in debt to see if there is any correlation. The first regression was okay, but I noticed it could use a semi-log transformation. After taking the log of the average daily balance, I got a much better-looking regression as well as a better r^2.
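To see why a semi-log transformation can help, here is a minimal sketch on simulated data (not the Fed file). The data frame sim and its columns days and balance are made up for illustration; balance is generated to grow roughly exponentially with days, which is the kind of situation where logging the response tends to straighten the relationship.

```r
# Simulated stand-in for the Fed data (NOT the real figures):
# balance grows roughly exponentially with days, with multiplicative noise.
set.seed(42)
days <- round(runif(200, min = 1, max = 800))
balance <- exp(2 + 0.004 * days + rnorm(200, sd = 0.5))
sim <- data.frame(days = days, balance = balance)

# Linear fit on the raw scale vs. semi-log fit (log of the response only)
fit.raw <- lm(balance ~ days, data = sim)
fit.log <- lm(log(balance) ~ days, data = sim)

# Pull out R^2 for each model
r2.raw <- summary(fit.raw)$r.squared
r2.log <- summary(fit.log)$r.squared
cat("R^2 raw:", round(r2.raw, 3), " R^2 semi-log:", round(r2.log, 3), "\n")
```

The two R^2 values are measured on different response scales (dollars versus log dollars), so they are only a rough guide. Comparing residual plots of both fits is a more reliable check.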

Below is my R code:

#Fed Data
library(stringr)
fed.01<-read.csv(file.choose(), header=TRUE)
summary(fed.01)
 
#Cleaning up the data: drop the leading $ sign, then remove the ',' in 1,000
average<-str_sub(fed.01$Average.Daily.Balance..in.Millions.of.U.S..Dollars.,
                 start=2, end=-1)
average<-as.numeric(gsub(",", '', average))
fed.02<-cbind(average, fed.01)
 
#Exploratory Graphs
hist(fed.02$Days.in.Debt, main='Histogram of Days in Debt',
     col='red', xlab='no. Days in Debt')
years.debt<-fed.02$Days.in.Debt/365
hist(years.debt, main='Histogram of Years in Debt', col='red',
     xlab='no. Years in Debt', breaks=15)
 
#Bar graphs of the data
#The country of origin
par(las=2, mar=c(5,12,4,2), mfrow=c(1,1))
country<-sort(table(fed.01$Country))
barplot(country, main='Nation of Banks', col='blue', horiz=TRUE)
 
#Type of Bank or Industry
par(las=2, mar=c(5,17,4,2), mfrow=c(1,1))
industry<-sort(table(fed.01$Industry))
barplot(industry, main='Type of Industry',
        col='blue', horiz=TRUE)
 
#Organizations with average balances greater than $5 billion
five.bill<-subset(fed.02, average>5000)
#order the rows so the sorted bars keep their matching company labels
five.bill<-five.bill[order(five.bill$average),]
par(las=2, mar=c(5,19,4,2), mfrow=c(1,1))
barplot(five.bill$average, names.arg=five.bill$Company,
        main='Companies With Average Daily Balance Greater\nthan $5 Billion',
        col='blue', horiz=TRUE)
 
 
#Organizations with debt more than 730 days (2 years)
year.comp<-subset(fed.02, Days.in.Debt>730)
#order the rows so the sorted bars keep their matching company labels
year.comp<-year.comp[order(year.comp$Days.in.Debt),]
par(las=1, mar=c(5,20,4,2))
barplot(year.comp$Days.in.Debt,
        names.arg=year.comp$Company,
        main='Companies With Days in Debt Greater\nthan 730 Days (2 Years)',
        xlab='days', col='red', horiz=TRUE, xpd=FALSE,
        xlim=c(720, 830))
par(las=0, mar=c(5,4,4,2))
 
#Regression of Days in Debt to Ave. Daily Balance
 
#plotted the data; the r^2 is poor and the slope is positive,
#nothing to get too excited about, so took the log
plot(fed.02$Days.in.Debt, fed.02$average, xlab='Days in Debt',
     ylab='Ave. Daily Balance',
     main='Scatter Plot:\nDaily Balance and Days in Debt')
lm.01<-lm(average~Days.in.Debt, data=fed.02)
abline(lm.01)
summary(lm.01)
 
#log of fed.02$average to reduce the influence of outliers
log.aver<-log(fed.02$average)
plot(fed.02$Days.in.Debt, log.aver, xlab='Days in Debt',
     ylab='Log of Ave. Daily Balance',
     main='Scatter Plot:\nLog Daily Balance and Days in Debt')
lm.02<-lm(log.aver~fed.02$Days.in.Debt)
abline(lm.02)
summary(lm.02)
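
On the $1,000.00 to 1000.00 problem mentioned at the top: one way to avoid depending on the $ being the first character is to strip every dollar sign, comma, and space with a single base R gsub() call. This is only a sketch; parse_dollars is a hypothetical helper name, not part of the script above, and the sample strings are made up.

```r
# Hypothetical helper (not from the original script):
# turn strings like "$1,000.00" into the number 1000
parse_dollars <- function(x) {
  # as.character() guards against factor columns, which read.csv()
  # produced by default in older versions of R
  cleaned <- gsub("[$,[:space:]]", "", as.character(x))
  as.numeric(cleaned)
}

# Made-up sample values, including a leading space and a factor input
amounts <- parse_dollars(c("$1,000.00", " $12,345,678.90", "$5"))
from.factor <- parse_dollars(factor("$2,500"))
print(amounts)
```

With a helper like this, the cleanup step could be written as parse_dollars(fed.01$Average.Daily.Balance..in.Millions.of.U.S..Dollars.) without loading stringr.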


Comments:

einar hjörleifsson, December 27, 2011 at 4:45 PM
Thanks for pointing out this source, it looks interesting.
str_replace_all may be a better function for getting rid of $, commas, and spaces (see usage in the code below).
Regarding the file on the individual companies, the following script may do the trick:

library(stringr)
library(lubridate)
library(reshape2)
setwd("~/prj/2012/01theFedBank/") # path to wherever the zip file resides
zFile <- "201112221200_fed_data_files_for_public_release.zip"
dat <- read.csv(unz(zFile,"1b_Company_Index_1.csv"),stringsAsFactors=FALSE)
names(dat) <- c("id","fileName","company","ticker","peakDate","peakAmount",
"country","industry","capitalRaised","average","daysInDept")

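# read each company's file from the zip and stack them in long format
DAT <- NULL
for (i in seq_along(dat$fileName)) {
  print(i)  # progress indicator
  # skip the 12 lines that sit above the column header in each file
  tmp <- read.csv(unz(zFile,dat$fileName[i]),skip=12,header=TRUE)
  names(tmp)[1:4] <- c("date","balance","stock","pmc")
  #tmp <- tmp[,-4]
  tmp$company <- dat$company[i]
  tmp <- melt(tmp,id.vars=c("date","company"))
  DAT <- rbind(DAT,tmp)
}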
DAT$date <- mdy(as.character(DAT$date))
DAT$value <- as.numeric(str_replace_all(DAT$value,"([%$, ])", ""))

einar hjörleifsson
0utlieR, December 31, 2011 at 2:34 PM
einar hjörleifsson,

Thanks for the code. I am working my way through it, and it seems much better. Thanks.

Outlier