It is believed that stock prices are impacted by three factors.

- Company/Industry performance
- Macroeconomic outlook
- Sentiment driven by news

The first two of these are defined by numerical data. I decided to build a neural-network engine that predicts the value of a particular stock from movements in these variables. The following variables were considered.

- **assetsWithBankingSystem** – Total assets with the banking system
- **bankCredit** – Bank credit in India
- **cash** – Cash in hand
- **investmentAtBookValue** – Total bank investments at book value
- **liabilitiesToBankingSystem** – Total liabilities of banks to the banking system
- **liabilitiesToOthers** – Total liability of banks other than the banking system
- **curcredit** – Current account credit in INR
- **curdebit** – Current account debit in INR
- **capcredit** – Capital account credit in INR
- **capdebit** – Capital account debit in INR
- **errcredit** – Errors credit
- **errdebit** – Errors debit
- **balcredit** – Balance credit
- **baldebit** – Balance debit
- **monmovcredit** – Monetary movements credit
- **monmovdebit** – Monetary movements debit
- **callMoneyHigh** – Call money rate, High
- **callMoneyLow** – Call money rate, Low
- **eps** – Earning per share of the company
- **ceps** – Cash earning per share of the company
- **bookValue** – Book value of the company
- **div** – Dividend paid per share of the company
- **opProfitPerShare** – Operating profit per share of the company
- **netOperatingIncomePerShare** – Net operating income per share of the company
- **freeReserves** – Free reserves with the company
- **opm** – Operating profit margin of the company
- **gpm** – Gross profit margin of the company
- **npm** – Net profit margin of the company
- **ronw** – Return on net worth of the company
- **debtToEquity** – Debt to equity ratio of the company
- **currentRatio** – Current ratio of the company
- **quickRatio** – Quick ratio of the company
- **interestCover** – Interest cover of the company
- **salesByTotalAssets** – Sales by total assets of the company
- **salesByFixedAssets** – Sales by fixed assets of the company
- **salesByCurrentAssets** – Sales by current assets of the company
- **noOfDaysOfWorkingCapital** – No of days of working capital with the company
- **cpi** – Consumer price index
- **br** – Bank Rate
- **idbiRate** – IDBI minimum term lending rate
- **maxCMR** – Maximum Call Money Rate
- **maxPLR** – Maximum prime lending rate
- **minPLR** – Minimum Prime lending rate
- **price** – Crude price
- **totalINRdebt** – Total debt in Indian Rupees
- **concessionalDebtAsPercOfTotal** – Concessional debt as a percentage of total
- **shortTermDebtAsPercOfTotal** – Short-term debt as a percentage of total
- **affConstant** – Agriculture, Forestry and Fishing, GDP factor cost, Constant prices
- **affCurrent** – Agriculture, Forestry and Fishing, GDP factor cost, Current prices
- **cspsConstant** – Community social and personal services, GDP factor cost, Constant prices
- **cspsCurrent** – Community social and personal services, GDP factor cost, Current prices
- **consConstant** – Construction, GDP factor cost, Constant prices
- **consCurrent** – Construction, GDP factor cost, Current prices
- **egwsConstant** – Electricity, Gas and Water Services, GDP factor cost, Constant prices
- **egwsCurrent** – Electricity, Gas and Water Services, GDP factor cost, Current prices
- **firebsConstant** – Finance, Insurance, Real Estate, and Business services, GDP factor cost, Constant prices
- **firebsCurrent** – Finance, Insurance, Real Estate, and Business services, GDP factor cost, Current prices
- **manuConstant** – Manufacturing, GDP factor cost, Constant prices
- **manuCurrent** – Manufacturing, GDP factor cost, Current prices
- **maqConstant** – Mining and quarrying, GDP factor cost, Constant prices
- **maqCurrent** – Mining and quarrying, GDP factor cost, Current prices
- **tdpConstant** – Total domestic product, GDP factor cost, Constant prices
- **tdpCurrent** – Total domestic product, GDP factor cost, Current prices
- **thrConstant** – Trade, Hotel and Restaurant, GDP factor cost, Constant prices
- **thrCurrent** – Trade, Hotel and Restaurant, GDP factor cost, Current prices
- **aff** – Agriculture, Forestry and Fishing, GDP factor cost
- **csps** – Community social and personal services, GDP factor cost
- **cons** – Construction, GDP factor cost
- **egws** – Electricity, Gas and Water Services, GDP factor cost
- **firb** – Finance, Insurance, Real Estate, and Business services, GDP factor cost
- **manuf** – Manufacturing, GDP factor cost
- **min** – Mining, GDP factor cost
- **tdp** – Total domestic product, GDP factor cost
- **thr** – Trade, Hotel and Restaurant, GDP factor cost
- **currencyWithPublic** – Total currency with Public
- **m3** – Money supply, also referred to as stock of legal currency in the economy
- **timeDepositsWithBank** – Total time deposits with the bank
- **totalIncome** – Total income of RBI
- **totalExpenditure** – Total expenditure of RBI
- **netAvailableBalance** – Net available balance in RBI
- **surplusToCentralGovernment** – Surplus payable to central government from RBI
- **totalIssuesLiabilities** – Total liabilities, Issues
- **totalIssuesAssets** – Total assets, Issues
- **totalBankingLiabilities** – Total liabilities, Banking
- **totalBankingAssets** – Total assets, Banking
- **reserveMoneyLiabilities** – Reserve Money, Liabilities
- **reserveMoneyAssets** – Reserve Money, Assets
- **forwardCashSpot** – Forward Cash Spot, USD forward premia
- **forwardCashOneMonth** – Forward Cash one month, USD forward premia
- **forwardCashThreeMonth** – Forward Cash three months, USD forward premia
- **forwardCashSixMonth** – Forward Cash six months, USD forward premia
- **forwardCash12Month** – Forward cash twelve months, USD forward premia
- **referenceRate** – RBI reference rate for USD
- **rate** – US interest rate
- **quantitiy** – Quantity of particular stock traded
- **turnover** – Total turn over of stock traded

We collected data for the above metrics and established their relationship with the following stock-specific data.

- Previous day close
- Day open
- Day high
- Day low
- Day close

Because the economic indicators comprise a very large number of input variables, many of which may be heavily correlated with one another, factor analysis was used to reduce them to a manageable set of features. These features later served as inputs to the neural network used to build the prediction model. For each company, four models were constructed as follows.

- 1D model, which would predict the prices for the next day given the stock price, turnover, and quantity from one day earlier
- 7D model, which would make predictions given the stock price, turnover, and quantity from a week earlier
- 15D model, which would make predictions 15 days down the line
- 180D model, which would make predictions six months down the line given the stock price for a day
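To make the four horizons concrete, here is a minimal sketch of how training pairs for such models could be assembled. It is not the original code: the function name `make_training_pairs`, the `HORIZONS` table, and the choice to count each horizon in rows of the daily series (rather than calendar days) are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# Assumed mapping from model name to horizon, counted in rows of the
# daily price series (an illustrative choice, not from the original).
HORIZONS: Dict[str, int] = {"1D": 1, "7D": 7, "15D": 15, "180D": 180}


def make_training_pairs(
    rows: List[Dict[str, float]], horizon: int, target: str = "close"
) -> List[Tuple[Dict[str, float], float]]:
    """Pair the observation at row t with the target value at row t + horizon."""
    if horizon < 1:
        raise ValueError("horizon must be at least 1")
    pairs = []
    # Rows without a matching future row are dropped; a series shorter
    # than the horizon therefore yields no pairs at all.
    for t in range(len(rows) - horizon):
        pairs.append((rows[t], rows[t + horizon][target]))
    return pairs
```

With a ten-row series and the 7D horizon, only the first three rows have a target seven rows ahead, so three pairs come back. The same function serves all four models; only `horizon` changes.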

Factor analysis reduced the 96 inputs to 20 factors that together explain 95% of the variance. These factors are as follows. The later factors mostly capture the residual variation left over from the earlier ones.

- Factor 1 – RBI influence and Core sector
- Factor 2 – Foreign Exchange and Crude
- Factor 3 – Agriculture, Total Domestic Product
- Factor 4 – Company Financials
- Factor 5 – Company Ratios
- Factor 6 – Agriculture, Community services, debt structure with RBI
- Factor 7 – Company Capital structure, profitability ratios, and other indicators
- Factor 8 – Banking system residuals
- Factor 9 – Company Liquidity Ratios
- Factor 10 – Company stock performance
- Factor 11 – RBI balance sheet debt structure and errors
- Factor 12 – RBI balance sheet errors
- Factor 13 – Company indicators (residuals)
- Factor 14 – Banking system residuals
- Factor 15 – Company financial ratios, Residuals
- Factor 16 – Foreign Exchange, Crude and interest rate, Residuals
- Factor 17 – Company Financial Ratios, Residuals part 1
- Factor 18 – Company Financial Ratios, Residuals part 2
- Factor 19 – USD Forward Spot rate
- Factor 20 – IDBI lending rate and crude prices
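The 95% cut-off can be illustrated with a short sketch. It assumes the factors are ranked by eigenvalues of the input correlation matrix, as in principal-component extraction; the extraction method actually used is not stated here, and the function name `factors_needed` and the example eigenvalues are hypothetical.

```python
from itertools import accumulate
from typing import Sequence


def factors_needed(eigenvalues: Sequence[float], threshold: float = 0.95) -> int:
    """Return the smallest number of leading factors whose cumulative
    share of total variance reaches the threshold."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    if total <= 0:
        raise ValueError("eigenvalues must sum to a positive number")
    for k, running in enumerate(accumulate(ordered), start=1):
        if running / total >= threshold:
            return k
    return len(ordered)


# Made-up eigenvalues, for illustration only (not taken from the study).
example = [6.0, 2.0, 1.0, 0.6, 0.4]
print(factors_needed(example))  # 4: the first four carry 96% of the variance
```

Applied to the real eigenvalues of the 96 inputs, the same rule would be expected to return 20.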

Companies in the NSE-50 index were considered.

## Design of neural networks

### Inputs and outputs

The economic indicators for each company's model have been reduced to 20 factors that explain most of the variation in these numbers. Three additional inputs are company specific and relate to that company's past stock price data.

- Previous Close
- Previous Turn Over
- Previous Quantity

Together these make up the 23 variables used as inputs to the neural network. Three separate neural networks are used, one for each of the following output variables:

- High
- Low
- Close
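As a rough sketch of the input layout described above, the 20 factor scores and the three company-specific values can be concatenated into one 23-element vector, with one network per target. The names `build_input_vector`, `N_FACTORS`, and `TARGETS` are hypothetical, and the ordering of the inputs is an assumption.

```python
from typing import List, Sequence

N_FACTORS = 20                      # factors produced by the factor analysis
TARGETS = ("high", "low", "close")  # one separate network per target


def build_input_vector(
    factor_scores: Sequence[float],
    prev_close: float,
    prev_turnover: float,
    prev_quantity: float,
) -> List[float]:
    """Concatenate the factor scores with the three company-specific inputs."""
    if len(factor_scores) != N_FACTORS:
        raise ValueError(f"expected {N_FACTORS} factor scores, got {len(factor_scores)}")
    # Assumed ordering: factors first, then close, turnover, quantity.
    return list(factor_scores) + [prev_close, prev_turnover, prev_quantity]
```

The same vector would feed all three networks. Only the target column used during training differs.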

### Hidden Layers

Given the richness of the data, it was assumed that at least two hidden layers would be needed to form a meaningful neural network. Each network has 23 inputs and 1 output. Several candidate networks were created, and each was given a sample training run of 1,500 cycles over the data set. At the end of the sample run, the best network was chosen for further training.

The networks that were evaluated consisted of:

- 1 input layer with 23 inputs
- a first hidden layer with 31 to 351 nodes
- a second hidden layer with 8 to 31 nodes
- 1 output layer
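The layer structure above can be sketched in plain Python. This is not Joone code and not the original implementation: the sigmoid activation, the weight initialisation range, and every function name here are assumptions, and the training step is left out. The sketch only shows the shape of a candidate network and how the best candidate could be picked once each has an error value.

```python
import math
import random
from typing import Dict, List, Sequence, Tuple

Layer = Tuple[List[List[float]], List[float]]  # (weights, biases)


def init_network(layer_sizes: Sequence[int], seed: int = 0) -> List[Layer]:
    """Create randomly initialised weights for a fully connected network."""
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
        layers.append((weights, [0.0] * n_out))
    return layers


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def forward(layers: List[Layer], inputs: Sequence[float]) -> List[float]:
    """Propagate one input vector through every layer (sigmoid assumed)."""
    activations = list(inputs)
    for weights, biases in layers:
        activations = [
            sigmoid(sum(w * a for w, a in zip(row, activations)) + b)
            for row, b in zip(weights, biases)
        ]
    return activations


def parameter_count(layer_sizes: Sequence[int]) -> int:
    """Number of weights plus biases in a fully connected network."""
    return sum(n_in * n_out + n_out for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))


def pick_best(errors: Dict[Tuple[int, int], float]) -> Tuple[int, int]:
    """Return the (hidden1, hidden2) pair with the lowest recorded error."""
    return min(errors, key=errors.get)


sizes = [23, 130, 17, 1]         # one candidate from the ranges listed above
print(parameter_count(sizes))    # 5365 trainable weights and biases
```

Because a sigmoid output lies between 0 and 1, prices would have to be scaled into that range before training and scaled back afterwards. That scaling is also an assumption of this sketch.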

The network with 130 nodes in the first hidden layer and 17 nodes in the second produced the best error values and was selected for further use. I used the libraries provided by Joone. The main fragments of code for this exercise are as follows.

## Results

Prediction: six months

## Conclusion

This exercise concludes that there is merit in using neural networks to understand and predict the behavior of markets, but with a certain caution. The following points are important to keep in mind if this model is used for investment decisions.

- The model does not return profitable results in very short duration trades. The investor should have an investment horizon of more than 6 months for the model to work properly.
- The model does not guarantee that every trade will be profitable, but overall there is a better chance of profit.
- Stocks with less volatility perform better in model-based prediction.

For more details please refer to the report attached below.