Peak Oil News

 

  Login or Register
 
Menu
 News
 Search
 Topics
 Stories Archive
 Submit News
 Discussions
 Code of Conduct
 Forums
 Forums Search
 Last 24 Hours
 PO 24hrs
 Peak Blog
 Resources
 About Us
 Downloads
 Web Links
 PeakWiki
 PeakPortal
 Focus Search
 Peak TV
 Peak Oil Boston
 Members
 Your Account
 Members List
 Ignore List
 JOIN!
 Private Messages
 
google
 
PeakSpeak
NICKNAME

Download TeamSpeak
What is PeakSpeak?
Peak Oil on IRC
 
Photo Album
Submit Photo
Peakoil.com is You!


member photos
 
Light Sweet Crude Oil
 
Member Quotes
If "it's bunker time" why the fark do you care about the price of gold? You evolved some enzyme that lets you digest the stuff?

Narz

Suggest Quote

 
aspo08
 
ICM
Cisco & Net App Training
 
Peak Oil News: Forums

Peakoil.com :: View topic - Bootstrapping Technique Applied to the Hubbert Linearization
 Forum FAQForum FAQ   SearchSearch   UsergroupsUsergroups   ProfileProfile   Log in to check your private messagesLog in to check your private messages   Log inLog in 

Bootstrapping Technique Applied to the Hubbert Linearization

 
Post new topic   Reply to topic   Printer-friendly version    Peakoil.com Forum Index -> Depletion Modeling
View previous topic :: View next topic  
Author Message
khebab
Moderator
Moderator


Joined: Sep 27, 2004
Posts: 935
Location: Canada

PostPosted: Tue Jan 10, 2006 3:09 pm    Post subject: Bootstrapping Technique Applied to the Hubbert Linearization Add User to Ignore List Reply with quote

Motivation:

This thread is a continuation of How Reliable is the Hubbert Linearization Method?. The main motivation here is again to gain some insight in the robusteness, the limitations of the curve fitting aproach. One main issue is the lack of confidence intervals on crucial parameters (Ultimate Recoverable Ressource, growth rate and peak date).

What is bootstrapping? it's a way to repeat an experiment. With the Bootstrap, a new set of experiment is not needed, the original data is reused. Specifically, the original observations are randomly reassigned and the estimate is recomputed. These assignments and recomputations are done a large number of times and considered as repeated assignements.

Why is the bootstrap attractive? It can answer many questions with very little in the way of modelling, assumptions and can be applied easily.

In general, the bootstrap is a methodology for answering the following question: How accurate is a parameter estimator?. Bootstrapping is a fairly new method in statistics and has been proposed by Efron in 1979. The R software (open source Smile) has a bootstrap library called 'boot' that implements almost all the standard techniques. I won't go into the details of the Bootstrap theory, a lot of details about the techniques and the R language implementation used in this post can be found in the following document:

John Fox. Bootstrapping Regression Models, Appendix to an R and S-plus Companion to Applied Regression

Application:

I restate the Hubbert linearization equation:
Code:
P(t)/Q(t)= K(1 - Q(t)/URR)

where Q(t) is the cumulative production at time t, K is the growth rate and URR=Q(t=+inf) (for a good discussion on this approach: Another Way of Looking at CERA).

The BP production data are used for the world production.

For each starting year Y we randomly generate 2,000 new samples (called bootstrap replicates) taken in the time range [Y:2004] on which a robust least-square fitting is perfomed (function rlm).


large version
(a) URR

large version
(b) K
Histograms and normal quantile-comparison plots for the bootstrap replications of the URR (a) and K (b). The broken vertical line in each histogram shows the original value for the model fit to the original sample.


The confidence intervals are derived from the bootstrap replicates using the so-called bias-corrected accelerated method (BCa). The R script is given in the post below and has produced the following numerical results.

Code:
   Y    URR(50%)U URR(50%)L  K(50%)L  K(50%)U    URR(90%)L URR(90%)U  K(90%)L   K(90%)U   URR(95%)L URR(95%)U  K(95%)L     K(95%)U
1  1940 1589.852 1666.108 0.06553184 0.06803704 -1047.8955 1722.340 0.06229963 0.07005338 -2115.754 1738.640 0.05931496 0.07057659
2  1945 1534.121 1604.606 0.06716867 0.06970711  -909.1412 1655.086 0.06448430 0.07143187 -1694.621 1674.993 0.06099418 0.07215952
3  1950 1474.262 1535.714 0.06936133 0.07203309  1415.9148 1580.065 0.06720602 0.07404873  1371.762 1593.248 0.06624411 0.07463725
4  1955 1411.245 1479.824 0.07170182 0.07469338  1358.8073 1529.928 0.06909329 0.07690102  1332.695 1570.994 0.06755400 0.07777587
5  1960 1344.488 1429.857 0.07396238 0.07755941  1288.4431 1999.462 0.05258170 0.08004962  1269.352 2069.939 0.05197241 0.08069580
6  1965 1251.694 1362.724 0.07767415 0.08219249  1127.0189 2032.670 0.05220087 0.08584747  1048.126 2077.450 0.05171007 0.08775371
7  1970 1867.918 2100.396 0.05173704 0.05643942  1248.4862 2169.640 0.05066231 0.08498707  1150.549 2186.294 0.05038610 0.08823982
8  1975 2016.554 2132.814 0.05118828 0.05302772  1827.8085 2199.954 0.05023261 0.05699570  1443.484 2219.633 0.04985585 0.07501213
9  1980 2048.701 2138.568 0.05113436 0.05246769  1920.1593 2201.945 0.05022459 0.05480756  1837.808 2223.453 0.04995022 0.05584261
10 1985 2124.600 2202.525 0.05005941 0.05111410  2015.7552 2273.314 0.04894792 0.05221503  1840.982 2353.670 0.04681022 0.05449696

from which we derive the following figures:



We can also look at the correlation between the bootstrap replicates of the K and URR coeffcients:

Scatterplot of the bootstrap replications of the URR and K coefficients for the BP data. The concentrations ellipse are drawn at 50, 90, and 99% level using a robust estimate of the covariance matrix of the coefficients

Discussion:

    1- The 95% confidence intervals for the URR and K using data from 1980 to 2004 is [1838.0 2223.0] Gb and [5.0 5.6]% respectively.
    2- Not surprisingly, the confidence intervals are strongly affected by the 70's hump in production. The 1965-1970 period constitutes a transition or shock period between two production regimes.
    3- Estimates from data starting around 1970 are fairly stable
    4- the parameter K is negatively correlated with the URR (higher K values will give lower URR)
    5- Many variations can be made, for instance in the way the robust least-square is applied

_________________
______________________________________
http://GraphOilogy.blogspot.com


Last edited by khebab on Tue Jan 10, 2006 8:07 pm; edited 3 times in total
Back to top
View user's profile Send private message Visit poster's website
khebab
Moderator
Moderator


Joined: Sep 27, 2004
Posts: 935
Location: Canada

PostPosted: Tue Jan 10, 2006 3:10 pm    Post subject: Re: Bootstrapping Technique Applied to the Hubbert Lineariza Add User to Ignore List Reply with quote

Enjoy! Smile
Code:
####################################################################
##
##  Script that illustrate bootstrapping techniques applied
##  to the Hubbert linearization of the total world production.
##
##  see the thread http://www.peakoil.com/fortopic16349.html for more details
## 
##  Author: Khebab (January, 2006)
##  version 1.0
##
####################################################################

rm(list=ls(all=TRUE))   # clean up
##
##  Load matlab emulator package and gremisc (to install)
##  in the menu bar: packages/install pacakage(s)
##
library(matlab)
library(gregmisc)
library(boot)
library(MASS)
library(car)

####################################################################
##
##              Main
##
####################################################################

#
#   World oil production since 1901 up to 2005 (from BP) in Gb (all liquids)
#
temp <- scan()
2.0000000e-001  1.6740000e-001  1.8180000e-001  1.9490000e-001  2.1800000e-001  2.1510000e-001 
2.1330000e-001  2.6420000e-001  2.8560000e-001  2.9870000e-001  3.2780000e-001  3.4440000e-001 
3.5240000e-001  3.8530000e-001  4.0750000e-001  4.3200000e-001  4.5750000e-001  5.0290000e-001 
5.0350000e-001  5.5590000e-001  6.8890000e-001  7.6600000e-001  8.5890000e-001  1.0157000e+000 
1.0143000e+000  1.0679000e+000  1.0968000e+000  1.2630000e+000  1.3250000e+000  1.4860000e+000 
1.4100000e+000  1.3730000e+000  1.3100000e+000  1.4420000e+000  1.5220000e+000  1.6540000e+000 
1.7920000e+000  2.0390000e+000  1.9880000e+000  2.0860000e+000  2.1500000e+000  2.2210000e+000 
2.0930000e+000  2.2570000e+000  2.5930000e+000  2.5950000e+000  2.7450000e+000  3.0220000e+000 
3.4330000e+000  3.4040000e+000  3.8030000e+000  4.2830000e+000  4.5190000e+000  4.7980000e+000 
5.0180000e+000  5.6260000e+000  6.1250000e+000  6.4390000e+000  6.6080000e+000  7.1340000e+000 
7.6613500e+000  8.1942500e+000  8.8877500e+000  9.5374500e+000  1.0285700e+001  1.1070450e+001 
1.2620000e+001  1.3550000e+001  1.4760000e+001  1.5930000e+001  1.7540000e+001  1.8560000e+001 
1.9590000e+001  2.1340000e+001  2.1400000e+001  2.0380000e+001  2.2050000e+001  2.2890000e+001 
2.3120000e+001  2.4110000e+001  2.2980000e+001  2.1730000e+001  2.0910000e+001  2.0660000e+001 
2.1050000e+001  2.0980000e+001  2.2070000e+001  2.2190000e+001  2.3050000e+001  2.3380000e+001 
2.3900000e+001  2.3830000e+001  2.4010000e+001  2.4110000e+001  2.4500000e+001  2.4860000e+001 
2.5510000e+001  2.6340000e+001  2.6860000e+001  2.6400000e+001  2.7360000e+001  2.7310000e+001 
2.7170000e+001  2.8120000e+001 

vWorldProduction <- matrix(temp, ncol=1 , byrow= F)


mProductionData <- cbind(cbind(c(1901:2004), vWorldProduction), cumsum(vWorldProduction));
mProductionData <- cbind(mProductionData, mProductionData[,2]/mProductionData[,3])

colnames(mProductionData , do.NULL = FALSE)
colnames(mProductionData )<- c("Year","Prod","CumProd","PQ")
frProductionData <- data.frame(year= mProductionData[, 1],  prod = mProductionData[,2], cumprod= mProductionData[,3], pq= mProductionData[,4])

head(frProductionData)
#
#   Function that apply the Hubbert linearization technique
#
boot.RLS.twoparam <- function(data, indices, maxit=20) {
   data <- data[indices,]
   mod <- rlm(pq ~ cumprod, data=data, maxit=maxit, method= "MM")
   v <- coefficients(mod)
   c(-v[1]/v[2], v[1])
}

matplot(x= frProductionData$cumprod, y= frProductionData$pq, pch = 1:4, type = "l", add= F, ylab= "P/q", xlab= "Q")
years <- c(1940,1945,1950,1955,1960,1965,1970,1975,1980,1985)
#years <- c(1940:1985)
Results.confidenceInterval <- data.frame(year= NA,
             URRMin1= NA, URRMax1= NA, KMin1= NA, KMax1= NA,
      URRMin2= NA, URRMax2= NA, KMin2= NA, KMax2= NA,
      URRMin3= NA, URRMax3= NA, KMin3= NA, KMax3= NA)
m <- 1
#
#   The bootstrapping technique is applied for different year intervals
#   from startingYear to 2004.
#
for(startingYear in years ) {
   
   idx <- find(frProductionData$year == startingYear)
   idx2 <- c(idx[1]:104)
   frSubProductionData <- data.frame(year= mProductionData[idx2 , 1],  prod = mProductionData[idx2 ,2],
             cumprod= mProductionData[idx2 ,3], pq= mProductionData[idx2 ,4]);
   # bootstrapping application
   Results.boot <- boot(frSubProductionData , boot.RLS.twoparam , 2000, maxit= 200)
   # confidence intervals calculation
   tempURR <- boot.ci(Results.boot, conf= 0.5, index= 1, type= c("perc","bca"))
   tempK <- boot.ci(Results.boot, conf= 0.5, index= 2, type= c("perc","bca"))
   tempURR2 <- boot.ci(Results.boot, conf= 0.9, index= 1, type= c("perc","bca"))
   tempK2 <- boot.ci(Results.boot, conf= 0.9, index= 2, type= c("perc","bca"))
   tempURR3 <- boot.ci(Results.boot, conf= 0.95, index= 1, type= c("perc","bca"))
   tempK3 <- boot.ci(Results.boot, conf= 0.95, index= 2, type= c("perc","bca"))
   Results.confidenceInterval[m,]= c(startingYear, tempURR$bca[4:5], tempK$bca[4:5]
      , tempURR2$bca[4:5], tempK2$bca[4:5], tempURR3$bca[4:5], tempK3$bca[4:5])
 
   plot(Results.boot, index=1)
   #jack.after.boot(Results.boot, index=1, main='URR')
   
   m <- m + 1
}
Results.confidenceInterval
write(t(Results.confidenceInterval), file="ConfidenceIntervals.txt")

_________________
______________________________________
http://GraphOilogy.blogspot.com
Back to top
View user's profile Send private message Visit poster's website
Display posts from previous:   
Post new topic   Reply to topic   Printer-friendly version    Peakoil.com Forum Index -> Depletion Modeling All times are GMT - 6 Hours
Page 1 of 1

 
Jump to:  
You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot vote in polls in this forum

Atom News FeedRSS 1.0 News FeedRSS 2.0 News FeedRSS Forums Feed