Comparing the Accuracy of Multiple Discriminant Analysis, Logistic Regression, and Neural Network to Estimate Pay and Not to Pay Dividend

This study compares the prediction accuracy of three models in estimating a company's dividend policy, in this case whether the company will pay or not pay dividends. The models used are Multiple Discriminant Analysis, Logistic Regression, and Neural Network. The samples are divided into two groups, namely companies that always paid dividends and companies that never paid dividends during the 2015-2018 research period, resulting in 256 samples not paying dividends and 128 samples paying dividends. The results show that the average accuracy performance of the Neural Network exceeds that of the other two models. The best predictors of a company's dividend policy in this study are Price to Book Value, Stock Price, Firm Cycle, Current Ratio, ROA, and Exchange Rate.


Introduction
Dividend policy is the decision on whether profits earned by a company will be distributed to shareholders as dividends or retained as retained earnings to finance future investment. If the company chooses to distribute profits as dividends, retained earnings fall and total internal resources fall with them. Various theories have been developed to explain corporate dividend policy. These theories include bird in the hand (Gordon, 1959); (Lintner J., 1956), agency cost (Jensen & Meckling, 1976), signaling (Bhattacharya, 1979), behavioral explanation (H. M. Shefrin & Thaler, 1988); (H. Shefrin & Statman, 1985), life cycle (Grullon, Michaely, & Swaminathan, 2002), and taxation (Miller & Scholes, 1978). Various studies have also analyzed the factors that influence dividend policy. These factors include profitability, liquidity, efficiency, leverage, life cycle, ownership, and size of the company (Lestari, 2018); (Franc-Dąbrowska, Mądra-Sawicka, & Ulrichs, 2020). Being able to predict whether a company will pay dividends is essential for investors in making their investment decisions. Companies that pay dividends are considered to have good future prospects.
The methods used to estimate dividend policy fall into several types, namely multiple regression, Tobit regression, Logistic Regression, Multiple Discriminant Analysis, and more recently the Neural Network approach. Within the dividend policy framework, the measure of dividend policy determines the method to be used. Dividend policy can be measured by dividend yield, dividends per share, or the policy to pay or not pay dividends. Multiple regression and Tobit regression models are used when the dependent variable is dividend yield or dividends per share, whereas for the decision to pay or not pay dividends, the Logistic Regression model (LR) and the Multiple Discriminant Analysis model (MDA) are used. In various studies, sample selection specifically excludes companies that do not pay dividends. This exclusion distorts the essence of dividend policy, because not paying dividends is itself a decision taken by the company. The reason for eliminating companies that do not pay dividends from the sample is to avoid skewness in the data distribution. Tobit regression, by contrast, works with censored data, because applying ordinary least squares to a skewed data distribution biases the results (Tobin, 1958); (Greene, 2000).
Neural Network (NN) is a model that follows the workings of the human brain, consisting of interconnected neurons that are used to provide solutions to various problems in various fields of science such as statistics, technology, and economics. Some studies show neural networks have more accurate predictions than other models (Abdou, Pointon, El-Masry, Olugbode, & Lister, 2012); (Laoh, 2019). This study aims to compare three models, i.e. Multiple Discriminant Analysis Model (MDA), Logistic Regression Model (LR), and Neural Network (NN), for predicting corporate dividend policy.

Theoretical Framework

Multiple Discriminant Analysis
Multiple discriminant analysis (MDA) is a multivariate statistical technique that aims to separate previously defined groups of data by forming discriminant functions. The technique is used for dependency relationships between response variables and explanatory variables. Discriminant analysis applies when the response variable is qualitative and the explanatory variables are quantitative (Coulombe, 1985). MDA is appropriate for examining a company's dividend policy, that is, for identifying the factors that distinguish companies that pay dividends from those that do not (Ismiyanti, 2005). The discriminant model is as follows:
$$Z = \alpha + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_k X_k \tag{1}$$
where Z = discriminant score, α = a constant term, βi = the discriminant coefficient or weight of variable i, Xi = predictor or independent variable i, and i = index of the predictor variables; i = 1, 2, 3, ..., k.
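To make equation (1) concrete, the following minimal Python sketch computes a discriminant score as a constant plus a weighted sum of predictors and assigns a group by comparing the score with a cutoff. The constant, coefficients, predictor values, and cutoff are hypothetical illustration values, not estimates from this study.

```python
def discriminant_score(alpha, betas, xs):
    """Return Z = alpha + sum_i beta_i * x_i, the form of equation (1)."""
    if len(betas) != len(xs):
        raise ValueError("betas and xs must have the same length")
    return alpha + sum(b * x for b, x in zip(betas, xs))


def classify(z, cutoff):
    """Assign group 1 when the score exceeds the cutoff, otherwise group 0."""
    return 1 if z > cutoff else 0


# Hypothetical values, for illustration only.
alpha = 0.5
betas = [0.3, -0.1, 0.2]
xs = [2.0, 1.0, 4.0]

z = discriminant_score(alpha, betas, xs)
group = classify(z, cutoff=1.0)
```

In practice the coefficients and the cutoff would come from the fitted discriminant function rather than being set by hand.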

Logistic Regression
Logistic regression (LR) is a type of regression that relates one or several independent variables to a categorical dependent variable coded 0 and 1. This categorical dependent variable distinguishes logistic regression from multiple regression and other linear regressions. LR models the probability of an outcome based on individual characteristics. The LR model is as follows:
$$\log\left(\frac{\pi}{1-\pi}\right) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_m x_m \tag{2}$$
where π is the probability of an event (e.g. paying dividends), βi are the regression coefficients associated with the reference group and the explanatory variables, and xi are the explanatory variables (Sperandei, 2014).
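The link between the linear predictor and the event probability in equation (2) can be sketched in a few lines of Python. This is a minimal illustration, assuming the standard log-odds form with an intercept beta0; the coefficient and predictor values are hypothetical and are not the fitted values of this study.

```python
import math


def logit(p):
    """Log-odds of a probability p, with 0 < p < 1."""
    return math.log(p / (1.0 - p))


def predicted_probability(beta0, betas, xs):
    """Invert the logit: pi = 1 / (1 + exp(-(beta0 + sum_i beta_i * x_i)))."""
    eta = beta0 + sum(b * x for b, x in zip(betas, xs))
    return 1.0 / (1.0 + math.exp(-eta))


# Hypothetical coefficients and predictor values, for illustration only.
beta0 = -1.0
betas = [0.8, 0.5]
xs = [1.0, 2.0]

pi = predicted_probability(beta0, betas, xs)
# A 0.5 threshold is a common default, assumed here for illustration.
pays_dividend = pi >= 0.5
```

Applying `logit` to the returned probability recovers the linear predictor, which is a quick way to check that the two functions are inverses of each other.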

Machine Learning
Machine Learning (ML) is an approach in Artificial Intelligence that is widely used to replace or simulate human behavior in order to solve problems or automate tasks. As the name implies, ML tries to mimic how humans or other intelligent beings learn and generalize. ML has at least two main applications, namely classification and prediction. The hallmark of ML is the process of training and learning; ML therefore requires data to learn from, which is called training data. Classification is an ML method in which the machine sorts or classifies objects based on certain characteristics, much as humans distinguish objects from one another. Prediction, or regression, is used by the machine to estimate the output for a given input based on the data learned in training. The most popular ML methods are decision making systems, support vector machines, and neural networks.

Neural Network
Artificial neural network (NN) systems are computer algorithms that can be 'trained' to mimic neural networks in the human brain (Dorsey, Edminister, & Johnson, 1995). The network consists of a large number of interconnected basic processing units that process information. The results of network processing come from the collective behavior of the units and depend on how the units interact with each other (Altman, Marco, & Varetto, 1994). By processing and evaluating interactions in complex data sets, the neural network tries to assign the right weight to each input to enable a correct deduction of the final result. The search for these input weights is aided by the 'genetic algorithm' optimization procedure, which simulates the predictive power of the model under a large number of scenarios and allows the best weighting scheme to survive and reproduce from one generation to the next.
NN is a machine learning technique that mimics human nerve cells, a fundamental part of the working brain. An NN consists of an input layer and an output layer. Each layer consists of several neurons, each with an activation function that determines the output of the unit. Hidden layers can be added to increase the capabilities of the NN. The NN is trained using the training data, and more training data generally improves its performance. The capacity of the NN also depends on the number of layers: the more layers, the higher the capacity.
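The layered structure described above can be sketched as a forward pass through a small feed-forward network. This is a minimal illustration, assuming sigmoid activations and fully connected layers; the layer sizes and all weights and biases are hypothetical, and no training step is shown.

```python
import math


def sigmoid(v):
    """Logistic activation, mapping any real value into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-v))


def layer(inputs, weights, biases):
    """One fully connected layer: each neuron outputs sigmoid(w . x + b)."""
    return [
        sigmoid(sum(w * x for w, x in zip(neuron_w, inputs)) + b)
        for neuron_w, b in zip(weights, biases)
    ]


def forward(inputs, hidden_w, hidden_b, out_w, out_b):
    """Input layer -> one hidden layer -> single output neuron."""
    hidden = layer(inputs, hidden_w, hidden_b)
    return layer(hidden, out_w, out_b)[0]


# Hypothetical weights: 5 inputs, 3 hidden neurons, 1 output neuron.
hidden_w = [
    [0.1, -0.2, 0.3, 0.0, 0.5],
    [-0.4, 0.2, 0.1, 0.3, -0.1],
    [0.2, 0.2, -0.3, 0.1, 0.0],
]
hidden_b = [0.0, 0.1, -0.1]
out_w = [[0.7, -0.5, 0.2]]
out_b = [0.05]

x = [0.5, 1.0, -1.0, 0.2, 0.0]
p = forward(x, hidden_w, hidden_b, out_w, out_b)
```

Training would adjust `hidden_w`, `hidden_b`, `out_w`, and `out_b` so that the output matches the observed classes; the sketch only shows how a trained network turns inputs into a score between 0 and 1.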

Methodology
This study uses secondary data taken from the financial statements of public companies in the manufacturing sector listed on the Indonesia Stock Exchange in the period 2015-2018. Based on their dividend policy, two groups of companies are formed, i.e. companies that continually pay dividends and companies that never pay dividends.

Research Model
This research tests the prediction accuracy of factors that determine a company's policy to pay or not pay dividends. Three models are used: Multiple Discriminant Analysis, Logistic Regression, and Neural Network. The predictors are 24 financial ratios covering liquidity, effectiveness, profitability, leverage, and market ratios; in addition to these ratios, factors relating to size, age, company life cycle, and macroeconomic conditions are included. In MDA, the best predictors are selected using stepwise discriminant analysis (Hertz et al., 1991). In this process, all predictors are entered into the model and then removed gradually if they do not have a statistically significant influence on discrimination. In the LR model, the best predictors are selected using the forward (conditional) method, whereas in the neural network the choice of predictors is based on the MDA and LR models.

Comparison of Prediction Accuracy
The accuracy of the three models in predicting whether companies pay or do not pay dividends is compared using three metrics, namely accuracy, precision, and sensitivity (Devi & Radhika, 2018). There is a fourth measure, namely specificity. However, because the sample sizes of the two groups of companies are unequal, specificity is not considered relevant in this research. The metrics are calculated from the classification obtained by comparing the predicted results with the observed data for each model. This classification yields four groups of samples, namely True Positive (TP), True Negative (TN), False Positive (FP), and False Negative (FN). The formulas for the three metrics are as follows:
$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{3}$$
$$\text{Precision} = \frac{TP}{TP + FP} \tag{4}$$
$$\text{Sensitivity} = \frac{TP}{TP + FN} \tag{5}$$
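The three metrics can be computed directly from the four classification counts, as the following minimal Python sketch shows. It assumes the standard definitions of accuracy, precision, and sensitivity and treats "pays dividends" as the positive class. The counts are illustrative assumptions, not values copied from the classification tables; they were chosen to be consistent with the group sizes of 128 paying and 256 non-paying samples and with the MDA percentages reported in this paper.

```python
def accuracy(tp, tn, fp, fn):
    """Share of all cases that are classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


def precision(tp, fp):
    """Share of predicted positives that are truly positive."""
    return tp / (tp + fp)


def sensitivity(tp, fn):
    """Share of actual positives that are predicted as positive."""
    return tp / (tp + fn)


# Illustrative counts (assumption): positive class = pays dividends.
tp, fp, fn, tn = 78, 28, 50, 228

acc = accuracy(tp, tn, fp, fn)   # 306 / 384
prec = precision(tp, fp)         # 78 / 106
sens = sensitivity(tp, fn)       # 78 / 128
```

Because sensitivity divides only by the actual positives, it is unaffected by how many negative cases the sample contains, whereas accuracy is pulled toward the performance on the larger group.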

Results and Discussion

Descriptive Statistics
Table 1 shows descriptive statistics for the two groups of companies, those that pay and those that do not pay dividends. The data show that companies that pay dividends tend to be more liquid and more profitable, have higher stock prices, and are at a life-cycle stage with a higher ratio of retained earnings to total assets. Even so, companies that pay dividends have a lower Price to Book Value, and payment occurs when the Rupiah exchange rate against the US Dollar tends to weaken. The MDA test results show a significant difference between the group of companies that pay dividends and the group that does not, indicated by the significance value of Wilks' Lambda, which is below 0.05. The six best predictors obtained by the stepwise method are the natural logarithm of stock price, Price to Book Value, Current Ratio, Return on Assets, Exchange Rate, and Retained Earnings to Total Assets. The canonical value is 0.692, which means that 77.51% of the variance is explained by the discriminant model that is formed. The dividend policy model based on MDA is as follows:
Z = 8.153 + 0.325 lnprice - 0.08 PBV + 0.02 CR + 0.43 ROA - 0.01 ER + 0.323 RETA (6)
with a cutting score of -0.513 for companies that pay dividends and 1.047 for companies that do not pay dividends. Table 2 shows the classification results as the percentage of cases predicted accurately.
The logistic regression model is used to classify the company's dividend policy, that is, whether the company will pay or not pay dividends. Logistic regression analysis examines the relationship between the dependent variable (dividend policy) and a group of independent variables, and provides predictions of the probability that a company pays dividends. The forward conditional method was used to obtain the best predictors, as given in equation (7). The value of Nagelkerke R Square is 0.641, which means that 64.1% of the variation in dividend policy is explained by the model. The significance value of the Hosmer and Lemeshow test is 0.10, which indicates that the model fits the data. The classification of predicted and observed values of the LR model can be seen in Table 3.
The dividend policy was also analyzed by applying the Neural Network. In this model, the research samples are divided into 302 training samples and 78 testing samples. The network has five inputs, where each input node represents one variable, and one hidden layer with three neurons. The model's rate of incorrect predictions is 14.2% for the training sample and 11.5% for the testing sample. Based on the neural network, the best predictors are, in order, ROA, PBV, RETA, Price, and Exchange Rate. The following table shows the predictive classification results of the NN:

Comparison of Accuracy Performance
Accuracy is the percentage of cases that are classified correctly and is commonly used as an estimate of classification performance. The NN model has higher accuracy than the other two models. For the training sample, the accuracy is 85.76%, while for the testing sample it reaches 88.46%. By comparison, the MDA model has an accuracy of 79.69%, while the LR model has a slightly better accuracy of 80.79%. The next metric is precision, which is the percentage of cases classified as positive that are actually positive, as seen in Figure 1.

Comparison of Precision Performance
On the precision metric, the Neural Network model marginally exceeds MDA and LR. The precision obtained is 73.81% for the training sample and 74.07% for the testing sample, compared with 73.58% for the MDA model and 73.64% for the LR model. The next metric is sensitivity, defined as the ratio of the number of true positive cases to the total number of actual positive cases [23]. The comparison of the sensitivity of the three models can be seen in Figure 2.

Comparison of Sensitivity Performance
As with the previous two metrics, the sensitivity of the neural network also exceeds that of the MDA and LR models. The Neural Network has a sensitivity of 90.29% for the training sample and 90.91% for the testing sample, considerably greater than the MDA value of 60.94% and the LR value of 64.80%, respectively.

Figure 3. Comparison of Sensitivity Performance
The overall average prediction performance across the three metrics, i.e. accuracy, precision, and sensitivity, can be seen in Figure 4. The Neural Network has an average predictive performance of 83.29% for the training sample and 84.48% for the testing sample. The average predictive performance is 71.48% for MDA and 73.22% for LR.
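The average performance figures for the Neural Network can be reproduced as the simple mean of the three metrics. The sketch below is a minimal illustration, assuming an unweighted arithmetic mean; the inputs are the accuracy, precision, and sensitivity percentages reported above for the NN training and testing samples.

```python
def average_performance(accuracy, precision, sensitivity):
    """Unweighted mean of the three metrics, in percent."""
    return (accuracy + precision + sensitivity) / 3.0


# Reported NN percentages: accuracy, precision, sensitivity.
nn_training = average_performance(85.76, 73.81, 90.29)
nn_testing = average_performance(88.46, 74.07, 90.91)

print(f"NN training average: {nn_training:.2f}%")
print(f"NN testing average:  {nn_testing:.2f}%")
```

An unweighted mean treats the three metrics as equally important; a different weighting would change the averages but not the individual metric comparisons.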

Conclusion
This study uses data taken from the financial statements of public companies whose shares were listed on the Indonesia Stock Exchange in the period 2015-2018. MDA shows that companies that pay dividends and companies that do not are differentiated by Price to Book Value, stock price, firm cycle, Current Ratio, ROA, and Exchange Rate. We used these variables as predictors in both LR and NN. By comparing the three metrics, we conclude that the neural network produces a better model than MDA and LR. The average performance on accuracy, precision, and sensitivity in predicting whether a company pays or does not pay dividends is best for the Neural Network model.