Factors Affecting the Punctuality of Logistics Services Using Binary Logistic Regression

The Indian logistics industry is in a period of rapid development. The purpose of this study is to analyze the factors affecting the punctuality of trucks used for logistics distribution services in India. The data consist of 6880 records from the VTS data platform. A binary logistic regression model was established for the analysis, and the model was tested to ensure statistical significance. The model was used to examine whether truck type, geographic location, commodity type, order type, and supplier are associated with significant differences between on-time and late deliveries. The results indicate that on-time delivery is more likely for shorter distances, for orders under a signed contract, for specific vehicle types, and in regions with a better-developed logistics level. This study provides suggestions for improving management strategies, which can be used as a reference for the development of logistics companies.


Background
India is one of the most populous countries in the world, and many of its industries are growing rapidly. The logistics industry in particular is developing quickly and has attracted wide attention. World-renowned logistics service companies have established logistics networks in India. Since 2017, the compound annual growth rate has remained above 10.5%, and the market size in 2020 was projected to exceed 215 billion US dollars [1]. The industry has become a strong support for India's current economic growth. With the development of e-commerce platforms, more industries and competitors are joining the logistics network, and the quality of logistics services and customer satisfaction are becoming increasingly important.

Related research
With regard to logistics services and punctuality, scholars from various countries have conducted research from many different perspectives.
Some researchers have studied the relationship between truck drivers' service and customers' purchasing behavior. Claye-Puaux et al. (2019) discussed the relationship between truck drivers and their work: 34% of truck drivers do not like their profession and do not want to maintain customer relationships [2]. They decline to build better customer relationships and do only the minimum needed to keep their jobs. As a result, consumers with longer or more complex addresses and contact details may be served last, other things being equal. A more in-depth study of this issue analyzes how truck drivers influence customer buying behavior in the supplier-customer relationship [3]. The service quality of truck drivers directly affects consumers' purchasing behavior, and the degree of influence weakens as the size of the buyer's enterprise increases.
Some scholars have studied and analyzed the evaluation system of logistics services. Wang (2019) proposed an improved method for existing evaluation metrics based on the SERVQUAL scale [4]. Akıl & Ungan (2021) analyzed e-commerce data using structural equation modeling and found that the accuracy and timeliness of orders have a positive impact on customer satisfaction and customer loyalty [5].
In addition to studying the influencing factors of logistics services, scholars have also studied the impact and changes of the times and the environment on the logistics industry. Mehrolia & Solaikutty (2021) used logistic regression to analyze the threat to food delivery services during COVID-19. Combined with the theory of self-protection, user behavior is summarized and analyzed [6]. Bookbinder & Springer (2013) analyzed the opportunities and challenges faced by Asian logistics development and discussed in detail how developing countries should develop their own logistics systems [7].
Existing research primarily examines the influence of the logistics service as a whole, the relationship between customer satisfaction and the quality of the logistics service, and the current quality evaluation system. Research on the relationships between individual influencing factors and the quality of logistics services is lacking. Studies of the variables influencing the standard of logistics services in India mostly focus on firm size and consumer groups. For example, Arlbjørn (2010) found that the economic level is the determinant of the development level of the express delivery industry [1]. Abraham (2011) used the value-added model analysis method to find that the age, gender, and transportation mode of the main consumer groups have an impact on the express delivery industry [8].
Existing research has not clearly defined the influencing factors of logistics service quality and is limited to a general exploration of influencing factors and to research on the evaluation system of logistics services. From the perspective of the influencing factors of logistics service quality, this paper uses the logistic regression model to explore the factors influencing the punctuality of logistics services and checks the accuracy of the research results through a stability test.

Objective
The goal of this study is to analyze the punctuality of logistics services in India in terms of several contributing factors. The study examines different influencing factors of logistics services, such as contract type, distance, truck type, geographic location, commodity type, and supplier.
The structure of this paper is as follows. The first section covers the literature review, focusing in particular on truck drivers' self-awareness and the variables influencing the quality of logistics services. The second section describes the data processing and research methods. The third section provides the detailed data analysis. The fourth section covers the discussion and results of the research. The last section concludes the study by summarizing its shortcomings and potential research directions.

Data collection
The order status of delivery trucks was the main subject of this study. The data record delivery-truck orders on the VTS data platform from March 28, 2019, to August 17, 2020. No personal information was collected in this study, and the dataset was obtained from Kaggle. The dataset contains 6880 records, mainly including location information, punctuality information, supplier information, truck information, and commodity information. The records cover most of India and a wide range of common commodity types. Therefore, this dataset is a suitable basis for the research.

Data preprocessing
A deep copy of the data set was made, the basic data types were inspected, and the numerical data were described statistically. The statistical results of the data set are shown in Fig. 1. There are 6880 records in total, the missing rate is 11%, and there are 4 data types. Highly correlated and uninformative features were removed, as required by the regression model. For example, the longitude and latitude of the coordinates and the ordered location information are correlated with each other. Columns with unique values, such as the order ID, were also removed. Columns with too high a share of missing data were deleted as well, as shown in Fig. 2 and Fig. 3. Remaining missing numerical values were filled with the median or mean as appropriate, and missing character data were replaced with 'Unknown'. Finally, character data were converted to a consistent case to ensure correct classification, and dummy variables were created.
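The preprocessing steps described above can be sketched as follows. This is only a minimal illustration using the Python standard library, not the code used in the study: the column names (`order_id`, `distance`, `contract`, `remark`), the 50% missing-rate cutoff, and the choice of the median for numerical columns are all assumptions made for the example.

```python
import copy
import statistics

def preprocess(rows, drop_columns=(), max_missing_rate=0.5):
    """Deep-copy the rows, drop unwanted or mostly-missing columns, fill gaps."""
    data = copy.deepcopy(rows)  # the caller's data stay untouched
    columns = [c for c in data[0] if c not in drop_columns]
    kept = []
    for col in columns:
        missing = sum(1 for r in data if r.get(col) is None)
        if missing / len(data) <= max_missing_rate:
            kept.append(col)
    cleaned = [{c: r.get(c) for c in kept} for r in data]
    for col in kept:
        present = [r[col] for r in cleaned if r[col] is not None]
        numeric = bool(present) and all(isinstance(v, (int, float)) for v in present)
        # Median for numerical columns, the label "unknown" for character columns.
        fill = statistics.median(present) if numeric else "unknown"
        for r in cleaned:
            if r[col] is None:
                r[col] = fill
            elif not numeric:
                r[col] = str(r[col]).strip().lower()  # consistent case
    return cleaned

def one_hot(rows, column):
    """Replace a categorical column with 0/1 dummy columns (first level is the baseline)."""
    levels = sorted({r[column] for r in rows})
    encoded = []
    for r in rows:
        new = {k: v for k, v in r.items() if k != column}
        for level in levels[1:]:
            new[f"{column}_{level}"] = 1 if r[column] == level else 0
        encoded.append(new)
    return encoded

# Hypothetical records, invented for illustration only.
rows = [
    {"order_id": 1, "distance": 100.0, "contract": "Regular", "remark": None},
    {"order_id": 2, "distance": None, "contract": "regular ", "remark": None},
    {"order_id": 3, "distance": 300.0, "contract": None, "remark": "x"},
    {"order_id": 4, "distance": 500.0, "contract": "MARKET", "remark": None},
]
clean = preprocess(rows, drop_columns=("order_id",))
encoded = one_hot(clean, "contract")
```

Dropping one dummy level per categorical variable (the baseline) avoids perfect collinearity among the dummy columns, which a regression model with an intercept cannot handle.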

Logistic regression model
MEEA 2022
Volume 34 (2022) 707

Logistic regression is built on a linear model: it maps the result of a linear function through a sigmoid function, as shown in function (1). The logistic regression model directly models the probability of class membership, which avoids problems caused by inaccurate assumptions about the data distribution. It yields approximate probability forecasts rather than just category predictions, which is helpful for the many tasks where probabilities are needed to aid decision-making. The logistic regression model has good mathematical properties and can be fitted with several optimization strategies to obtain the best solution. The binary logistic regression model is also faster to compute and has a much smaller memory footprint than many other models.
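For reference, the sigmoid mapping described above is usually written in the following standard form. The symbols ($p$ for the probability of on-time delivery, $x_i$ for the predictors, $\beta_i$ for the coefficients) are notation assumed here for illustration and may differ from the notation of the original function (1).

```latex
\begin{align}
  z &= \beta_0 + \beta_1 x_1 + \dots + \beta_k x_k, \\
  p &= P(y = 1 \mid x) = \frac{1}{1 + e^{-z}}, \\
  \ln\frac{p}{1 - p} &= z .
\end{align}
```

The last line shows why the model is linear in the log-odds: each coefficient $\beta_i$ is the change in the log-odds of the event for a one-unit increase in $x_i$.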

Results
The analysis is based on the model output. A comprehensive evaluation of the model coefficients determines whether the OR value of each variable is statistically significant and whether the model as a whole is meaningful. The Hosmer and Lemeshow Test evaluates the model's goodness of fit and indicates whether the information in the data has been sufficiently extracted. The Variables in the Equation output shows how much each variable affects the outcome.

Omnibus Test
Comprehensive assessments of model coefficients are known as omnibus tests. The Model line outputs the result of the likelihood ratio test, which tests whether all parameters in the logistic regression model are equal to 0. P < 0.05 indicates that at least one variable in the fitted model has a statistically significant OR value, so the model is meaningful overall. The specific results and values are shown in Table 1.

Table 1. Omnibus Tests of Model Coefficients
Chi-square df Sig.
Step 1 Step

Model Summary
To evaluate the goodness of fit quantitatively, the -2 log-likelihood value is an important index, which is 3098 here. The smaller the value, the better the fit, so it can be used to assess how well the model fits. The Model Summary table also contains the Cox & Snell R Square and the Nagelkerke R Square, which approximate the percentage of the dependent variable's variance that the fitted model can explain. These two R2 values are sometimes referred to as pseudo R2s; they have little meaning in logistic regression (unlike in linear regression) and can be ignored [9]. The specific results and values are shown in Table 2.
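As an illustration of how these summary quantities relate to each other, the sketch below computes the -2 log-likelihood, the likelihood-ratio chi-square against an intercept-only model, and the two pseudo R2 values from observed outcomes and fitted probabilities. The function names and the four toy observations are assumptions made for the example; they are not data or output from this study.

```python
import math

def neg2_log_likelihood(observed, probabilities):
    """-2 times the Bernoulli log-likelihood of the observed 0/1 outcomes."""
    log_lik = sum(
        math.log(p if y == 1 else 1.0 - p)
        for y, p in zip(observed, probabilities)
    )
    return -2.0 * log_lik

def model_summary(observed, probabilities):
    """Compare the fitted model with an intercept-only (null) model."""
    n = len(observed)
    base_rate = sum(observed) / n  # null model predicts the same rate for everyone
    null = neg2_log_likelihood(observed, [base_rate] * n)
    fitted = neg2_log_likelihood(observed, probabilities)
    lr_chi_square = null - fitted  # statistic of the omnibus (likelihood ratio) test
    cox_snell = 1.0 - math.exp(-lr_chi_square / n)
    nagelkerke = cox_snell / (1.0 - math.exp(-null / n))  # rescaled to a maximum of 1
    return {
        "neg2_ll": fitted,
        "lr_chi_square": lr_chi_square,
        "cox_snell_r2": cox_snell,
        "nagelkerke_r2": nagelkerke,
    }

# Toy outcomes and fitted probabilities, invented for illustration only.
summary = model_summary([1, 1, 0, 0], [0.8, 0.7, 0.3, 0.2])
```

The sketch assumes that both outcomes occur in the data and that no fitted probability is exactly 0 or 1; otherwise the logarithm is undefined.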

Hosmer and Lemeshow Test
The Hosmer and Lemeshow test is one of the longest-established methods for assessing the goodness of fit of a model. The specific results and values are shown in Tables 3 and 4. When the P value is not less than the test level (i.e., P > 0.05), the information in the data is considered to have been thoroughly extracted and the model has a high goodness of fit [9]. The p-value in this study is 0.125 > 0.05. Since the model passes the HL test, there is no need to consider reducing the number of variables or increasing the sample size.
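A minimal sketch of how the Hosmer and Lemeshow statistic can be computed is given below. It sorts the cases by predicted probability, splits them into groups of nearly equal size, and compares observed with expected event counts in each group. The helper names, the toy data, and the use of four groups are assumptions for the example; statistical packages may form the groups slightly differently, and the closed-form p-value used here is valid only when the degrees of freedom are even (as with the usual ten groups, which give 8 degrees of freedom).

```python
import math

def chi2_sf_even_df(x, df):
    """Upper-tail chi-square probability; this closed form holds only for even df."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form needs a positive, even df")
    half = x / 2.0
    return math.exp(-half) * sum(half ** i / math.factorial(i) for i in range(df // 2))

def hosmer_lemeshow(observed, predicted, groups=10):
    """Return (statistic, degrees of freedom, p-value) of the HL test."""
    pairs = sorted(zip(predicted, observed), key=lambda pair: pair[0])
    n = len(pairs)
    statistic, used = 0.0, 0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue
        used += 1
        expected = sum(p for p, _ in chunk)  # expected number of events
        events = sum(y for _, y in chunk)    # observed number of events
        variance = expected * (1.0 - expected / len(chunk))
        if variance > 0:
            statistic += (events - expected) ** 2 / variance
    df = used - 2
    return statistic, df, chi2_sf_even_df(statistic, df)

# Toy data, invented for illustration only.
y = [0, 0, 0, 1, 1, 0, 1, 1]
p = [0.2, 0.2, 0.4, 0.4, 0.6, 0.6, 0.8, 0.8]
hl_stat, hl_df, hl_p = hosmer_lemeshow(y, p, groups=4)
```

A large p-value means that observed and expected counts do not differ significantly across the groups, which is the sense in which the study's value of 0.125 indicates an acceptable fit.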

Variable analysis
To test the research objectives, the binary logistic regression results are shown below (Tables 5-8). In this study, whether the goods were delivered on time was treated as the dependent variable (0 = not delivered on time, 1 = delivered on time). The transport distance, the type of transport truck, the region where the truck is located, the supplier ID, the type of goods shipped, and the status of the contract with the supplier were taken as the predictor variables. Table 5 also provides information on the impact of the independent variables on on-time delivery (see odds ratio [OR]). Distance (b=-0.001, p=0.001, p<0.01) and contract type (b=1.958, p=0.044, 0.01<p<0.05) meet the significance criteria and are statistically significant. Contract type (Market/Regular) has a positive regression slope, indicating that contracted suppliers are more likely to deliver goods on time. The regression slope for distance is negative, indicating that orders over greater distances are less likely to be on time. The odds ratios of the predictors show that the odds of an order being delivered on time change by a factor of 0.999 for each one-unit increase in the distance of the order, and by a factor of 7.084 for a one-unit increase in the contract type. However, the commodity type and the supplier have no significant effect on the punctuality of shipments in this model (Table 6). Therefore, the type and supplier of the shipped item are not important predictors of whether the item will be delivered on time, and their hypothesized influence is not supported.
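The relationship between the reported coefficients and the odds ratios can be checked with a short calculation, since the odds ratio is the exponential of the regression slope. The sketch below uses the two slopes reported above; the function names are hypothetical, and the 100-unit distance increase is chosen only to illustrate how a small per-unit effect accumulates.

```python
import math

def odds_ratio(b):
    """Odds ratio for a one-unit increase in a predictor with slope b."""
    return math.exp(b)

def odds_multiplier(b, delta):
    """Factor by which the odds change when the predictor increases by delta units."""
    return math.exp(b * delta)

or_distance = odds_ratio(-0.001)                # about 0.999 per unit of distance
or_contract = odds_ratio(1.958)                 # about 7.085 from the rounded slope
or_100_units = odds_multiplier(-0.001, 100.0)   # about 0.905 for 100 more units
```

The small gap between 7.085 computed here and the reported 7.084 is consistent with the slope having been rounded to three decimals before exponentiation.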

Model accuracy analysis
Once the logistic regression model has been fitted, the probability of the outcome event (in this case, on-time delivery) can be determined for any combination of independent variables. The regression model predicts that the delivery arrives on time when the probability of the event is greater than 0.5; if the probability is less than 0.5, the event is judged not to have occurred (not delivered on time). The predictions of the logistic regression model can therefore be evaluated against the real outcomes [10].
The logistic regression model in this study was able to correctly classify 88.8% of the observed samples, as expected. This metric is often referred to as percentage accuracy in classification [10].
The specific values are shown in Table 9. The model correctly predicted 95.6% of late deliveries as late and 74.3% of on-time deliveries as on time.
Table 9. Classification Table (rows: observed OnTime 0 and 1; columns: predicted OnTime 0 and 1, Percentage Correct)
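The cutoff rule and the percentages in a classification table can be illustrated with the following sketch. The probabilities and outcomes are toy values invented for the example, and the function names are hypothetical; the code only shows how the per-class and overall percentages are derived from a 0.5 cutoff.

```python
def classify(probabilities, threshold=0.5):
    """Predict 1 (on time) when the probability exceeds the threshold, else 0 (late)."""
    return [1 if p > threshold else 0 for p in probabilities]

def classification_table(observed, predicted):
    """Count observed/predicted combinations and compute the percentages correct."""
    counts = {(o, p): 0 for o in (0, 1) for p in (0, 1)}
    for o, p in zip(observed, predicted):
        counts[(o, p)] += 1

    def percent(correct, total):
        return 100.0 * correct / total if total else float("nan")

    late_total = counts[(0, 0)] + counts[(0, 1)]
    on_time_total = counts[(1, 0)] + counts[(1, 1)]
    return {
        "counts": counts,
        "late_correct_pct": percent(counts[(0, 0)], late_total),
        "on_time_correct_pct": percent(counts[(1, 1)], on_time_total),
        "overall_pct": percent(counts[(0, 0)] + counts[(1, 1)], late_total + on_time_total),
    }

# Toy fitted probabilities and observed outcomes, invented for illustration only.
probs = [0.1, 0.2, 0.3, 0.4, 0.6, 0.9, 0.8, 0.4]
observed = [0, 0, 0, 0, 0, 1, 1, 1]
table = classification_table(observed, classify(probs))
```

The overall percentage is a weighted average of the two per-class percentages, so it lies between them and is pulled toward the class with more observations.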

Discussion and Implications
In this research, a logistic regression model was successfully developed with the main purpose of identifying the factors affecting the punctuality of express delivery. The study examined contract type, truck type, truck location, transportation distance, supplier, and commodity type as candidate factors for group differences. In other words, on-time delivery of logistics items was associated with a shorter delivery distance and with having a contract. In addition, the type and location of the truck also had an impact on timing. Some substantial marketing implications can be drawn from these results.
Research conducted by Claye-Puaux et al. (2019) shows that truck drivers are reluctant to actively maintain customer relationships, and the more complex the customer conditions, the greater the negative impact [2,11]. This is consistent with the model results: the greater the distance, the lower the probability that the delivery arrives on time. Akıl & Ungan (2021) argue that the punctuality of orders has a positive relationship with customers' purchasing behavior and loyalty [5]. This is consistent with the finding that orders with regular contracts are significantly more punctual than those without contracts. The research of Arlbjørn (2010) shows that the level of economic development in a region determines the level of logistics services in the region [1,12]. This conclusion is not completely consistent with the results of the model. The per capita GDP rankings of the AP, AS, and UK states, which lie at the extremes, are 19, 30, and 8 respectively [13]. Economically developed regions have better punctuality of logistics services than underdeveloped regions, but this is not absolute. The experimental results can only show that a positive influence trend exists and should be treated only as a reference.
Based on these findings, the punctuality of logistics can be improved in the following respects. Many logistics companies are already improving incentives to boost employee motivation, but this is not enough. Logistics enterprises need to build a more complete management model [14], and actively changing the management mode is necessary. In addition to adopting a more reasonable reward and punishment system, the idea of quality management should be applied to truck drivers, service personnel, and work processes. Service personnel who deal directly with customers should be encouraged to consider customers' needs and to take improving service quality as their first goal. It is also necessary to strengthen the supervision of the quality of service that employees provide and to establish standards for logistics services, so that the logistics services of enterprises become more standardized and rational. Logistics service providers can offer service evaluation procedures, and the logistics company should check the evaluations in real time to ensure that the service of each order is supervised. This would reduce drivers' resistance to long-distance orders caused by negative emotions and increase the on-time rate of express delivery services.
With the development of the logistics industry, the competitiveness and service standards of logistics enterprises are constantly increasing. Logistics enterprises should actively use advanced equipment and technology to better meet the requirements of logistics services [6]. This is mainly reflected in more suitable vehicle configurations, better trucks, and better query services. A more efficient selection of truck types and a better truck configuration can improve the punctuality of a company's logistics services. Express Mail Service, for example, has introduced advanced management technology and modern equipment (mainly logistics equipment), which has effectively improved the efficiency of its logistics services as well as its management and query capabilities, and has greatly improved service quality and customer satisfaction [2,15].

Conclusion
After preprocessing the data and testing the model, the overall classification percentage exceeds 60%, which represents a good fit, and the test value of P>0.05 means that the established model is reasonable for the data. This suggests that the results reflect the true relationships in the data and that the model is statistically significant. Based on the analysis of data on logistics services in India, this study identified the main influencing factors. The choice of truck drivers and equipment has a great impact on the punctuality of logistics services. Better management models and employee training, as well as better equipment, can improve both customer satisfaction and the quality of logistics services. Therefore, logistics companies can consider the potential of these factors to improve competitiveness.
This study has limitations that can be addressed in future studies. The author takes the data of the VTS platform as the target group, which does not reflect the current situation in India as a whole. More data from other platforms or companies should be added to better analyze the factors that affect the punctuality of logistics. Because the sample is large and some fields have many missing values, some data were removed, so the research results cannot fully represent the real situation and can only be used as a reference. The model predicts the punctuality of logistics, and future work could add other considerations, such as time constraints on orders, delay penalties, and driver earnings and bonuses, to develop more informative features.