TREND LINE AND CASH-FLOW FORECAST BUDGET AUDIT FOR THE DOW JONES U.S. OIL & GAS INDEX (^DJUSEN) & THE CONOCOPHILLIPS (COP) UNIVARIATE & MULTIVARIATE LINEAR & NONLINEAR MODELING.docx
TREND LINE AND CASH-FLOW FORECAST BUDGET AUDIT FOR THE DOW JONES U.S. OIL & GAS INDEX (^DJUSEN) & THE CONOCOPHILLIPS (COP): UNIVARIATE & MULTIVARIATE LINEAR & NONLINEAR MODELING
Introduction
This is a study of trend line and cash-flow forecasts. This study creates a budget audit for the Dow Jones U.S. Oil & Gas Index (^Djusen) & The Conocophillips (Cop). This study tests the univariate & multivariate linear & nonlinear modeling techniques, to determine which one is the best in terms of accuracy of statistical measures such as the correlation between actual and predicted values. Likewise, it conducts hypothesis testing, to see whether the null hypothesis of no correlation can be rejected in favor of an alternate hypothesis. Further, traders may consider integrating such analysis with accounting software such as Microsoft Office Accounting (MOA), as well as spreadsheet analysis, such as Microsoft Excel using test and graphics methods.
This study develops an automated forecasting trend analysis based web query. Such method will facilitate proper cash management and integration with Excel accounting software, and a professional trading software such as Wealth-Lab, Trade-Station (Easy Language) etc...
Visual Graphic Trading Strategy Matching the Poly and the Linear Trend Line to Time the Market
--- Chart 2a insert here
Table 2a shows the daily Cop as of Monday, November 19, 2007. Chart 2A Open-Hi-Lo-Close (OHLC) Stock Market Prices shows the Max (peak) and Min (valleys) of the Poly function. Ideally, a trader should trade in the congruent areas. The congruent areas are areas where the Linear and the Poly functions are moving in the same direction. Since the general direction of the linear function is down during October, November, the trader should short the stock during the down range of the Poly function. For example, starting on Tuesday October 15-26, or Monday November5-17, is a cycle of about a week. Wait a day, borrow the stock at $87.38, wait a couple of drops, and exit the short trade at about 81.26 on Wednesday October 24, 2007.
Go Short for a Down Linear Trend or Long for an Up Trend Matched to the Poly Function
This short trade will gain more than 3% within a span of a week, 12 Month x 4 weeks = 48 weeksx3%= 144% annual ROI, compared to 5% at the bank. Univariate Forecasting Model Trend Line OHLC Chart Compared to Multivariate Models Univariate Forecasting Model Trend Line OHLC Chart Compared to Multivariate Models. The univariate model involves only a single predictor independent variable (X), such as the date, predicting the predicted, dependent variable, the stock market value (Y), Open, in this case, or the Adjusted Close, of the Open, High, Low, Close (OHLC) chart. In contrast, the multivariate model will include more than 1 predictor variable such as Open, High, Low, Close, Volume, in addition to Date. The additional variables will raise the complexity but may also enhance the reliability as the R-square measures.
Polynomial (Poly) Model a Univariate Non-linear Forecasting Model Trend Line OHLC Chart
Polynomial Model a Univariate Non-linear Forecasting Model Trend Line describes the relationship between the Julian date and the stock market values, along with an Open-High-Low-Close (OHLC) stock chart. Dow Jones U.S. Oil & Gas Index (^DJUSEN) is an industry index of stocks in of companies such as ConocoPhillips (COP). This study investigates which forecasting method is most accurate for daily stock quote forecast, for a single industry and a single company. Our assessment reveals that consistently the Polynomial Trend Line (Poly) is the most accurate model based on its R-Square value.
Data Analysis and Data Sources
Data has been retrieved from Yahoo daily historical quotes: http://finance.yahoo.com/q/hp?s=COP . Using Excel Web Query, the data can be updated automatically and recalculated on a regular basis after market hours, so it will be updated, when the trader is ready to trade. Once the data has been refreshed, the spreadsheet recalculates all the models automatically, reproducing updated analysis.
--- Table 1 insert here
Table 1 shows the daily Dow Jones U.S. Oil & Gas Index (^DJUSEN) as of Monday, November 19, 2007, from Yahoo financial, historical quotes. Following are the models, rank, algorithms, R-square, and the functions:
|
Model |
Rank |
Algorithm |
R-Square |
Function |
|
Monday, November 19, 2007 |
|
||
|
1 |
1 |
Poly |
R˛ = 0.693 |
y = 3E-07x6 - 0.065x5 + 6487.x4 - 3E+08x3 + 1E+13x2 - 2E+17x + 1E+21 |
|||||
|
2 |
2 |
Log |
R˛ = 0.293 |
y = -2116ln(x) + 22459 |
|
|
|
|
|
|
3 |
3 |
Expo |
R˛ = 0.287 |
y = 1E+17e-8E-0x |
|
|
|
|
|
|
4 |
4 |
Linear |
R˛ = 0.281 |
y = -0.534x + 21685 |
|
|
|
|
|
|
5 |
5 |
Power |
R˛ = 0.224 |
y = 2E+12x-27.0
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Log. Expo, Linear & Power Model a Univariate Forecasting Models Trend Lines OHLC Chart
The Model, Rank, Algorithm, R-Square, and Function show how model 1 ranks as number 1, the top ranked model is the Poly. The highest R˛ = 0.693, explains why we rank this function, y = 3E-07x6 - 0.065x5 + 6487.x4 - 3E+08x3 + 1E+13x2 - 2E+17x + 1E+21, as the best model. The next model number 2 is ranked as number 2, which is the Log. The number 3 rank is Expo; Linear follows with number 4, and Power the last one with number 5.
2 2 Log R˛ = 0.293 y = -2116ln(x) + 22459
3 3 Expo R˛ = 0.287 y = 1E+17e-8E-0x
4 4 Linear R˛ = 0.281 y = -0.534x + 21685
5 5 Power R˛ = 0.224 y = 2E+12x-27.0
ConocoPhillips (COP) an Oil and Gas Industry Company Ranks Poly as the Top Model
Just like the industry index the ConocoPhillips (COP) An Oil and Gas Industry Company Ranks Poly as The Top Model. Again, the Poly ranks number 1, even as the other models follow with lower rankings. This confirms our initial notion that the Poly is the best Univariate Forecasting Model. However, it is also more complicated than the linear models; therefore, many traders are using the linear models, especially to calculate budget numbers for accounting purposes.
--- Table 2 insert here
Table 2 shows the same picture for the individual company, as we saw for the industry index, namely that the Poly is a superior univariate stock market forecasting model. Furthermore, we may be able to see the point of entry into a trade after the Poly trend line reaches a minimum point on the Monday, turning upward for the next day. Following are the models for the individual company:
No. Rank Model R**2 Function
Table 2a
1 1 Poly R˛ = 0.88 y = 1E-07x6-0.033x5+3290.x4-2E+08x3+5E+12x2-8E+16x+5E+20
2 2 Log R˛ = 0.678 y = -7265ln(x) + 76958
3 2 Linear R˛ = 0.678 y = -0.184x + 7349
4 3 Expo R˛ = 0.672 y = 2E+39e-0.00x
5 3 Power R˛ = 0.672 y = 2E+30x-86.2
Typically, COP demonstrates the pattern of the entire industry, the Poly model is the best univariate forecaster.
Table 2b Cash-flow Forecast
|
|
|
|
Tuesday, November 20, 2007 |
|
|
|
Budget |
=FORECAST($A$3+1,$G$3:G67,$A$3:$A$67) |
|
|
|
Budget |
Sale |
$ 83.70 |
Wednesday, November 21, 2007 |
Close |
|
Budget |
Cost |
$ 79.29 |
Tuesday, November 20, 2007 |
Close |
|
Budget |
Gain |
$ 4.41 |
1 Day Lag Budget Gain |
|
|
|
=IF(K12>0,"Gain","Loss") |
|
||
|
|
Actual |
|
Wednesday, November 21, 2007 |
|
|
Actual |
Sale |
$ 77.82 |
Close Wednesday |
|
|
Actual |
Cost |
$ (78.67) |
Open Wednesday |
|
|
Actual |
Loss |
$ (0.85) |
=IF(K17>0,"Gain","Loss") |
|
|
Notice, the A3+1, in the formula indicates that we are forecasting 1 day into the future, =FORECAST($A$3+1,$G$3:G67,$A$3:$A$67) |
|
On Tuesday, November 20, 2007, a trader may use the Excel Forecast function to budget an entry into a trade in the following way. Starting with a budget of Budget=FORECAST($A$3+1,$G$3:G67,$A$3:$A$67) leading to a Budget Sale value of $ 83.70 as of Wednesday, November 21, 2007 Close, less than the Budget Cost of $ 79.29, as of Tuesday, November 20, 2007 Close, producing an expected Budget Gain of $ 4.41, which may be a 1 Day Lag Budget Gain, using the =IF(K12>0,"Gain","Loss") formula to classify the result.
On Tuesday, November 20, 2007, the trader creates a micro budget using the “Budget =FORECAST ($A$3+1,$G$3:G67,$A$3:$A$67)”. The cash-flow forecast for a single share (Micro forecast) is the corner stone for each budget. It included the “Budget Sale $ 83.70 for Wednesday, November 21, 2007, Close, less the “Budget Cost $ 79.29, for Tuesday, November 20, 2007, Close.” This cash inflow less cash outflow produces, the cash balance and the budgeted profit, “Budget Gain $ 4.41 1 Day Lag Budget Gain,” using the MS-Excel formula “=IF(K12>0,"Gain","Loss").”
The “Actual” on Wednesday, November 21, 2007, with “Actual Sale $ 77.82 Close” on Wednesday, with “Actual Cost of $(78.67), Open Wednesday” results in an “Actual Loss $ (0.85),” using the MS-Excel formula of”=IF(K17>0,"Gain","Loss").” Notice, the resulting loss, is due to forecasting errors.
Overstating the Forecasted Sales Values the Trade End up Losing Instead of Gaining
Due to overstating the forecasted sales values, the trade ends up losing instead of gaining as follows:
Actual Wednesday, November 21, 2007
Actual Sale $ 77.82 Close Wednesday
Actual Cost $ (78.67) Open Wednesday
Actual Loss $ (0.85) =IF(K17>0,"Gain","Loss")
Lower Risk Trade - Linear & Poly Trend Line Point at the Same Direction up or down
One of the problems of the present trade is that the Linear and Poly trend lines disagree. Namely, the linear line goes down, while the Poly line goes up. A lower risk trade will be a trade, where the Linear and the Poly trend line, all point at the same direction, either up or down, but at a directional agreement. What is the conclusion? Buy a stock whose linear trend line and the Poly trend line point at the same direction, up. Or go short, on a stock that both the Linear and Poly trend lines go downwards.
To improve the accuracy of the forecast the trader may use a multivariate instead of a univariate regression model, especially in case of incongruent Linear and Poly trend lines.
--- Table 3b insert here
Table 3b shows the multivariate linear regression hypothesis testing analysis with the following results:
Hypothesis Testing: Rejecting the Null Hypothesis in Favor of the Alternate Hypothesis
|
Table 3b |
|
|
|
|
|
|
|
ANOVA |
|
|
|
|
|
|
|
|
df |
SS |
MS |
F |
Significance F |
|
|
Regression |
6 |
468.2863161 |
78.04772 |
43.821931 |
8.99806E-20 |
|
Table 3b ANOVA, with df (Degrees of Freedom), SS (Sum of Squares), MS (Mean Square), F Significance F, as they apply to the "Regression," with values of 6, variables: 468.2863161, 78.04772, 43.821931, 8.99806E-20, which is less than .05 or 5%. Therefore, the traders reject the null hypothesis in favor of the alternate.
Null hypothesis: the R-Square is equal to zero or R-Square=0
Alternate hypothesis: the R-Square is greater than zero or R-Square>0.
Obviously, in this case the R-Square is greater than zero.
--- Table 3c insert here
Table 3c shows the multivariate linear regression analysis with the following results:
It shows the SUMMARY OUTPUT, with the Regression Statistics, including the Multiple R 0.905138732, and the R Square 0.819276123, which is higher than both the univariate Poly models. Likewise, it includes the ANOVA (Analysis of Variance) and the Coefficients used in the forecasting function including the following:
Intercept 257.1520643
Date -0.00631914
Open 0.231859829
High 0.481689606
Low -0.16719762
Close 0.354160436
Volume -9.65562E-08
--- Table 3d insert here
Table 3d shows the multivariate linear regression residuals with the following results:
RESIDUAL OUTPUT, PROBABILITY OUTPUT, Observation Predicted Adj Close*, Residuals, Standard Residuals, Percentile, Adj Close*, with the 1st record, record number 1, has the values of 77.88901506, 1.400984942, 1.102745015, 0.769230769, and 77.13. Most importantly it includes the Adj Close (Adjusted Close) forecast of $77.13, which is much more accurate than the Poly model:
|
Adj Close* |
|
77.13 |
--- Table 4a insert here
Table 4a shows the multivariate linear regression results applied to forecast the next day stock market value with the following results:
|
Tuesday, November 20, 2007 |
|
|
|||
|
ConocoPhillips (COP) (39407=date**) |
|
|
|
||
|
Wednesday 22, 2007 |
Coefficients |
*Values 4 |
Monday, November 19, 2007 |
||
|
item |
Data |
Intercept |
257.1520643 |
257.152064 |
|
|
1 |
39407 |
Date |
-0.00631914 |
-249.01834 |
=M12*O12 |
|
2 |
77.54 |
Open |
0.231859829 |
17.9784111 |
=M13*O13 |
|
3 |
79.87 |
High |
0.481689606 |
38.4725488 |
=M14*O14 |
|
4 |
77.54 |
Low |
-0.16719762 |
-12.964503 |
=M15*O15 |
|
5 |
79.29 |
Close |
0.354160436 |
28.081381 |
=M16*O16 |
|
6 |
2E+07 |
Volume |
-9.6556E-08 |
-1.4860583 |
=M17*O17 |
|
|
Forecast |
Adj Close* |
|
78.22 |
=SUM(P11:P17) |
|
|
Actual |
|
|
77.82 |
='Cop-Wed'!G4 |
|
|
Error |
Over |
|
0.40 |
=P18-P19 |
|
|
%Error |
%Over |
|
1% |
=P20/P19 |
|
|
|
=IF(P21<0,"%Under","%Over") |
|
||
|
Cash-flow Forecast |
|
|
|
|
|
|
Shares |
Acc |
Budget |
Actual |
Variance |
%Variance |
|
1 |
Sale |
78.22 |
77.82 |
(0.40) |
|
|
1 |
Cost |
77.13 |
77.13 |
0.00 |
|
|
1 |
Gain |
1.09 |
0.69 |
(0.40) |
|
|
2 |
%Gain |
1% |
1% |
100% |
|
|
|
=IF(N26>0,"%Gain","%Loss") |
|
|
||
|
10 |
Total |
$10.86 |
$6.90 |
($3.96) |
|
|
1000 |
Total |
$1,085.51 |
$690.00 |
($395.51) |
|
|
|
|
|
|
|
|
Notice, Microsoft Excel (MSExcel) records date as a Julian date (39407=date**), namely the number of days, minutes, seconds, etc, since a given date, which is the start of counting. Excel can reformat such a Julian date in a variety of formats, such as “Wednesday 22, 2007.” Internally, however, this is a number of days. Thus, this number of days can be used as an input to a regression analysis, or a forecast trend line.
The forecast enables a traded to create a cash-flow forecast budget, with a micro variance analysis. Such variance analysis is based on 1 share only, which can be easily multiplied by the number of shares and scaled up to 10 shares, 1,000 shares, etc.
Variance Analysis Comparing Budgeted Less Actual Performance with Favorable or Unfavorable Results
The cash-flow forecast shows the number of "Shares, Acc (Account),Budget Actual Variance, %Variance, 1" line includes" Sale, 78.22, 77.82,(0.40)," which is an unfavorable variance, since actual profit is lower than budgeted profit. Likewise, it scales to 10 shares, and then to 1,000 shares with a total of ($395.51), and an unfavorable variance of -36%.
--- Table 4 insert here
Table 4a shows the multivariate linear regression lag of 1 period applied to forecast the next day stock market value with the following results:
|
Table 4 |
|
|
|
|
|
|
|
=CORREL(I4:I68,J4:J68) |
= (K5:K68,I5:I68) |
|
|
ConocoPhillips (COP) |
77.54 |
79.87 |
77.54 |
79.3 |
15,390,600 |
|
|
=E3-B2 |
-2% |
99% |
|
Date |
Open |
High |
Low |
Close |
Volume |
Adj Close* |
A close-Close |
Gain(Loss) |
Days |
Gap |
|
Monday, November 19, 2007 |
78.55 |
79.24 |
77.09 |
77.1 |
11,041,800 |
79.29 |
$2.16 |
$1.75 |
|
2% |
Lag Data Matched to Number of Periods Forecasted Into the Future 1 Day Lag for Next Day Forecast
Table 4 shows the daily stock quotes and the correlations, =CORREL(I4:I68,J4:J68),= (K5:K68,I5:I68), between vectors of lagged stock quotes. As the traders want to forecast the next day price, the traders lag the data 1 period, matching the date of Monday, November 19, 2007, with the data of the prior Friday ($78.55 Open), shifting the Monday Open of $77.54, above where it is excluded from the regression model. Notice, that the correlation between the unlagged and the lagged is 99%, which is extremely high, indicating that the lag has a very low impact on the forecast.
For (COP) the shifted foreword Friday data to Monday includes the following values: 77.54, 79.87, 77.54, 79.3, and 15,390,600, followed by the calculations of some of the columns: =E3-B2. These numbers represent the following variables: Date, Open, High, Low, Close, Volume, and Adj Close*. This record structure repeats itself for the next day. As the trader tries to forecast a larger number of days into the future, the lag may have to be increase to include more than 1 day. Thus, it may become more and more significant.
SUMMARY, CONCLUSIONS, & IMPLICATIONS
Summary
This study demonstrated the superiority of the polynomial model compared to other univariate forecasting models. It shows that to minimize risk, traders should try to pick companies, whose polynomial and linear univariate trend line move in the same directions. Otherwise, as in this case, the traders have a higher probability of experiencing forecasting errors and trading losses.
Likewise, this study demonstrated that the multivariate model is usually superior to the equivalent univariate model. In this case, the study compared the univariate linear regression, to its counterpart the multivariate regression analysis.
This study tested multiple levels of lagging the data depending on the number of days into the future for which the forecast applies.
Conclusion
Find companies whose linear and polynomial trend line move in the same direction. Verify that the trends have been established visually and statistically. Produce a budget, and most importantly back-test the model, prior to applying the strategy to an actual trade, otherwise, the trader may take too much risk, and loose accordingly.
Test the model statistically to see whether the null hypothesis of no correlation can be rejected in favor of the alternate hypothesis, and proceed based on the results to refine and improve the model.
Implications
This study has to be repeated over time with additional variables to test, as well as additional industries and companies. Testing should be repeated for short selling, as well as “going long” depending of the forecasted trend lines. Such model has to be repeatedly tested with multiple entry and exit points of an equity trade. Then, it can be replicated for fixed income, options, currencies, futures and other markets.
Even if the polynomial univariate model is superior to the linear model, by its R-square, the linear model is simpler, and more easily understood. Therefore, the best of both worlds is to apply all of them to take advantage of the strength of all models.
Further, traders may consider integrating such analysis with accounting software such as Microsoft Office Accounting (MOA). Traders should integrate the micro budget for 1 share, and scaling it up to a larger number of shares. Likewise, traders should use spreadsheet analysis, such as Microsoft Excel using test and graphics methods, as well as professional trading software such as Trade-station, Wealth-lab, etc.
To maximize automation while minimizing complexity, the trader should use Excel’s Web Query function, to automatically update the numbers, on a Daily basis. Likewise, maximize the use of Excel functions, as opposed to using VBA (Visual BASIC for Application) to calculate the trend lines, the forecasts, etc. For example, use the multiple linear regression function, instead of the add-on regression routine (Analysis Tool Pak, Regression Analysis). Thus, for example, use the =LINEST(D2:D13, C2:C13,,TRUE), producing the Slope (m), y-intercept (b), and the other LINEST( ) Statistics, such statistics as m, b, se, seb, the F Statistic, R-square etc. This method will maximize efficiency, stability, reliability and speed. Finally, after all the built-in functions have been fully used, to complete the Excel automation, the developer has to apply VBA, to complete the integration of trading software, accounting software, and Excel.
Chart 2A Open-Hi-Lo-Close (OHLC) Stock Market Prices
