Optimized Deep Learning for Enhanced Trade-off in Differentially Private Learning

Privacy and data analytics are two conflicting domains that have gained interest due to the advancement of technology in the big data era. Organizations in sectors such as finance, healthcare, and e-commerce use the data they collect to enable innovative decision making and analysis. What is sidelined is the fact that the collected data include private data of the individuals involved, which may be exploited and used for unjustified purposes. Defending privacy and performing useful analytics are two sides of the same coin, and hence achieving a good balance between them is challenging. This paper proposes an optimized differentially private deep learning mechanism that enhances the trade-off between the conflicting objectives of privacy, accuracy, and performance. The goal of this paper is to provide an optimal solution that gives a quantifiable trade-off between these contradictory objectives.
Keywords-privacy; optimization; pareto-optimal; analytics

I. INTRODUCTION
Nowadays, privacy is a tough bargain. With the onset of digitization, online activities are on the rise, and so is the risk of private information disclosure. Online services like shopping, trading, banking, and entertainment are sources of data collection that include sensitive information of the people involved. While not all of these services aim to exploit personal information, certain applications use it to their benefit. For instance, the online browsing information of people is utilized for providing personalized recommendations as a part of a marketing strategy. In healthcare, sensitive information such as disease conditions is used for research and analysis, which in turn may be helpful for diagnostic research. Such usage scenarios can be extended to many fields. To safeguard privacy, data transformation methods are employed, which protect sensitive information while still enabling useful analytics. This is easier said than done, since privacy defense and effective analysis conflict with each other. Given the size of big data, the challenge becomes even bigger and takes on a multi-objective perspective. A privacy preserving analytic system should be able to balance the multiple criteria of privacy, utility, and performance. Numerous privacy mechanisms have been proposed that manage privacy preserving analytics for big data. They revolve around privacy algorithms, namely k-anonymity [1], variants of k-anonymity, and l-diversity [2]. K-anonymity methods apply data transformation techniques that make the data unidentifiable. In approaches based on k-anonymity, k determines the degree of anonymity, and hence the choice of k is an important decision. Optimizing k can pave the way for better privacy, but at the cost of degradation in data utility. Moreover, optimization approaches have the added limitation of performance overhead.
With anonymization, a compromise solution is acceptable, provided preference is given to one of the objectives, and this is attributed to the practical setting of the application. More recently, there has been a growing interest in the mathematical foundation provided by the privacy algorithm called differential privacy [3]. The core algorithm is specifically ε-differential privacy, after which relaxed variants have been proposed [4]. Companies such as Google [5] and Apple [5] have used the algorithm for protecting the privacy of their customers, thereby ensuring that they see only a transformed form of the original data. The basic notion of the algorithm is to protect the private information of a user, irrespective of the user's participation or non-participation in data analysis. While the algorithm has been strongly recommended for privacy protection [5], only an equally strong learning mechanism can provide worthwhile analytical results.
II. PAPER CONTRIBUTION
Deep learning [6] with differential privacy is a recently emergent domain that has many research applications in the
• Optimizing epsilon for trade-off benefit.
• Developing an efficient learning technique to balance the trade-off.
• Extending the trade-off beyond privacy and quality of data analysis.
The core idea of the current paper is based on bridging the aforementioned research gap in differentially private learning. This paper is targeted at the second and third of the above cases. The main contributions of the paper are:
• Firstly, given the strong mathematical background of differential privacy, an optimized deep learning architecture is developed that enhances the efficiency of differentially private learning through effective hyperparameter optimization.
• Secondly, an attempt is made to consider privacy preserving analytics from the performance perspective and the triple trade-off between privacy, utility and performance is enhanced in comparison to the existing techniques.
• Thirdly, a single optimum solution for the problem is identified by adapting a prototypical decision-making strategy which is challenging in any multi-objective problem scenario.
III. BACKGROUND
Differential privacy [3,7] is a notable algorithm used in the field of privacy preserving analysis. It was proposed in 2006 [3] and has been proven to provide strong mathematical guarantees for efficiently quantifying privacy. The key idea behind the algorithm is that the analysis of any dataset is not affected by the participation or non-participation of an individual. Put differently, the information about any individual that can be learnt from the data remains effectively the same before and after analysis. In practice, the algorithm works by adding noise to data. The addition of random noise distorts the data, thereby perturbing the outcome of the analysis and preserving privacy. It is based on probability theory and promises that the sensitive information of individuals in the data is not affected by its use in any type of study. The most popular form of the algorithm is ε-differential privacy [3], in which epsilon (ε) defines a bound on the privacy loss. It can be used to formally quantify privacy loss and is used as the basis for effective analytics.

• Definition
A randomized algorithm A gives ε-differential privacy if, for all datasets D′ and D′′ that differ in one instance, and for any S ⊆ Range(A), (1) holds [3]:

$$\Pr[A(D') \in S] \le e^{\varepsilon} \cdot \Pr[A(D'') \in S] \tag{1}$$

In (1), the datasets D′ and D′′ satisfy the constraint $\|D' - D''\|_1 \le 1$, and epsilon is a positive real number that quantifies the privacy loss with a change in data.
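To make the definition concrete, the following is a minimal sketch of one common way to obtain ε-differential privacy for a counting query: the Laplace mechanism. The paper does not state which noise mechanism it uses, so the choice of Laplace noise, the function names `laplace_noise` and `private_count`, and the sample records are assumptions made purely for illustration.

```python
import random


def laplace_noise(scale, rng):
    """Draw one sample from a Laplace(0, scale) distribution.

    The difference of two independent exponential draws, each with
    mean `scale`, follows a Laplace distribution with that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(records, predicate, epsilon, rng):
    """Return a noisy count of the records matching `predicate`.

    Adding or removing one record changes a count by at most 1
    (L1 sensitivity 1), so Laplace noise with scale 1/epsilon is
    the standard calibration for epsilon-differential privacy.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be a positive real number")
    true_count = sum(1 for r in records if predicate(r))
    return true_count + laplace_noise(1.0 / epsilon, rng)


# Hypothetical usage: ages are made-up values, not data from the paper.
rng = random.Random(0)
ages = [23, 45, 31, 52, 67, 38]
noisy = private_count(ages, lambda age: age > 40, epsilon=0.5, rng=rng)
```

Smaller values of epsilon give a larger noise scale, which is the privacy-utility tension the rest of the paper tries to balance.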

IV. MATHEMATICAL PROBLEM FORMULATION
The mathematical problem formulation is conceived as a multi-objective optimization problem and will be alternately referred to as multi-attribute or multi-criteria problem in the rest of this paper.

A. Mathematical Modeling of the Proposed Model
Consider M as the vector space of decision variables. Let P(m) and U(m) represent the objective functions to be maximized. Mathematically, the problem can be formulated as:

$$\max_{m \in M} \; \{P(m),\, U(m)\}$$

where P(m) and U(m) represent the privacy and utility of the system respectively. In the context of differential privacy, the problem is re-defined as:

$$\min \; \varepsilon, \qquad \max \; Acc = \frac{c}{N_i}$$

where $N_i$ is the overall number of instances and c is the number of correctly classified instances.

B. Pareto-Optimality
When resolving a multi-attribute optimization problem, a solution that simultaneously achieves all objectives is not possible, since improving one objective degrades the other. Consider a set of solutions $S = \{\vec{v}_1, \vec{v}_2, \vec{v}_3, \ldots, \vec{v}_n\}$ to a multi-attribute problem. A solution $\vec{v}_1 \in S$ is said to be better than (to dominate) another solution $\vec{v}_2 \in S$ if it satisfies the following condition, stated here for objectives $f_k \in \{P, U\}$ that are to be maximized:

$$f_k(\vec{v}_1) \ge f_k(\vec{v}_2) \;\; \text{for every objective } k, \quad \text{and} \quad f_j(\vec{v}_1) > f_j(\vec{v}_2) \;\; \text{for at least one objective } j$$

When no solution in S dominates another, S is called a Pareto-optimal set [8]: all of its solutions are possible candidates for becoming an optimal solution, but none can be improved in one objective without degrading another. Pareto-optimality is a common occurrence in multi-objective optimization problems. Figure 2 shows the general occurrence of Pareto-optimality in a two-dimensional Euclidean space.
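The dominance test can be sketched in a few lines for the two objectives used in this paper. This is only an illustration: it assumes each solution is an (epsilon, accuracy) pair in which a lower epsilon and a higher accuracy are better, and the names `dominates` and `pareto_front` as well as the sample points are hypothetical.

```python
def dominates(a, b):
    """Return True if solution a Pareto-dominates solution b.

    Each solution is an (epsilon, accuracy) pair. Lower epsilon means
    more privacy and higher accuracy means more utility, so a dominates
    b when it is no worse in both and strictly better in at least one.
    """
    eps_a, acc_a = a
    eps_b, acc_b = b
    no_worse = eps_a <= eps_b and acc_a >= acc_b
    strictly_better = eps_a < eps_b or acc_a > acc_b
    return no_worse and strictly_better


def pareto_front(points):
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Made-up (epsilon, accuracy) pairs, not results from the paper.
candidates = [(1.0, 80.0), (2.0, 85.0), (2.5, 82.0), (0.5, 70.0), (3.0, 85.0)]
front = pareto_front(candidates)
```

In this example, (2.5, 82.0) and (3.0, 85.0) are dropped because (2.0, 85.0) offers at least the same accuracy at a lower epsilon.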

V. DESIGN METHODOLOGY
This section describes the scheme of the proposed system. A differential privacy-based optimized deep neural network architecture for implementing privacy preserving learning is suggested. This approach addresses the twin challenges of privacy loss and the efficiency of the learning technique. Deep learning frameworks for differential privacy are gaining importance for privacy preserving analytics. An appropriate blend of these techniques can provide robust solutions for the problem of privacy preserving learning, primarily for two reasons. Firstly, differential privacy is an algorithm that can provide a strict data privacy guarantee. Secondly, analysis on private data must incorporate complex learning architectures whose outcome can quantitatively substantiate the data privacy guarantee. Hence this methodology develops a Bayesian optimized deep neural network architecture for private learning, with accounting of the epsilon that determines the privacy assurance offered by the model.

A. Overview of System Architecture
A deep neural network is trained on differentially private data, and the intended learning task is classification analysis. The model topology begins with an embedding layer to handle the categorical parameters of the input data. The transformed input is reshaped, and the layers are concatenated before passing to the dense layer. The topology then alternates between dense and normalization layers. Each dense layer has 1000 units. Batch normalization is carried out to stabilize the output structure. This architecture is considered as the standard learner. In the standard learner, these neural net parameters are chosen arbitrarily, and they are fixed as the reference model against which the optimized variant will be compared. Since these parameters determine the strength of the learning process, a good combination of the parameters provides significant performance improvement. Specifically, in deep learning parlance, these parameters are known as hyperparameters [7]. Hyperparameter values have a deciding role in the learning process. Selecting optimal values for the hyperparameters is called hyperparameter optimization [7,9]. Fine-tuning the hyperparameters of the model allows the privacy-utility-performance trade-off to be handled in a quantitatively principled fashion. In this work, the proposed architecture with tuned hyperparameters is referred to as DPBODL (Differentially Private Bayesian Optimized Deep Learner). The design of the system is shown in Figure 3.

B. Hyperparameter Optimization
Methods such as Grid Search and Random Search [7] search the entire space of available parameters to arrive at optimal values. These methods are costly, and training a model using them is challenging. The use of genetic algorithms [10] for efficient optimization has also been reported. Bayesian optimization [7] reduces the time required for the parameter search, which in turn limits the model training time: only a selected set of parameters is chosen for each subsequent iteration, so the search concentrates on the region that contains the best values found in the previous evaluations. In this way, the method can efficiently select the global optimal hyperparameter set when the search terminates. Although hyperparameter optimization has been used for non-private models [11,12], evaluating its efficiency for private deep learning models has far-reaching research possibilities. The proposed DPBODL exploits this possibility.

C. Steps for Optimizing the Deep Learner
Let $H_q$ be the initial hyperparameter search space and Acc the classification accuracy. Let $\theta_{NN}(X, H_q)$ be the objective function of the Differentially Private Deep Neural Network (DPDNN) with X as the input space. If function values follow a Gaussian distribution, the optimization function is defined as:
Step 3: Calculate the maximum a posteriori hypothesis of the set of hyperparameters, $P(Acc \mid H_t)$. The optimizer selects the hyperparameters that maximize the accuracy of the deep learner.
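The kind of search loop described in this section can be sketched as follows. This is a toy illustration under stated assumptions, not the authors' implementation: the paper does not specify the surrogate model or the acquisition rule, so a zero-mean Gaussian process with a squared-exponential kernel and an upper-confidence-bound rule are assumed here, the search is over a single hypothetical hyperparameter on a fixed grid, and all function names are invented. In the paper's setting, `objective` would be the accuracy of the trained DPDNN for a given hyperparameter choice.

```python
import math


def rbf(a, b, length_scale=0.2):
    """Squared-exponential kernel between two scalar inputs."""
    return math.exp(-0.5 * ((a - b) / length_scale) ** 2)


def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x


def gp_posterior(xs, ys, x_new, jitter=1e-6):
    """Posterior mean and variance of a zero-mean GP at x_new."""
    K = [[rbf(a, b) + (jitter if i == j else 0.0) for j, b in enumerate(xs)]
         for i, a in enumerate(xs)]
    k_star = [rbf(a, x_new) for a in xs]
    alpha = solve(K, ys)        # K^-1 y
    v = solve(K, k_star)        # K^-1 k_star
    mean = sum(k * a for k, a in zip(k_star, alpha))
    var = max(rbf(x_new, x_new) - sum(k * w for k, w in zip(k_star, v)), 0.0)
    return mean, var


def bayes_opt(objective, candidates, init, n_iter=8, kappa=2.0):
    """Maximize `objective` over a grid using a GP surrogate and UCB."""
    xs = list(init)
    ys = [objective(x) for x in xs]
    for _ in range(n_iter):
        remaining = [c for c in candidates if c not in xs]
        if not remaining:
            break

        def ucb(c):
            mean, var = gp_posterior(xs, ys, c)
            return mean + kappa * math.sqrt(var)

        x_next = max(remaining, key=ucb)   # most promising untried point
        xs.append(x_next)
        ys.append(objective(x_next))
    best = max(range(len(xs)), key=lambda i: ys[i])
    return xs[best], ys[best], xs
```

Each iteration evaluates only one new candidate, chosen from what the surrogate has learnt so far, which is why this style of search needs far fewer training runs than an exhaustive grid.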
www.etasr.com Geetha et al.: Optimized Deep Learning for Enhanced Trade-off in Differentially Private Learning
VI. EXPERIMENTAL SETUP
The deep learning model is trained and its hyperparameters are tuned using TensorFlow Core 2.0 [13]. For the purpose of experimental evaluation, two different types of tabular data varying in size and dimension have been selected. The adult and diabetes datasets adapted from the UCI data repository [14] are used. Adult data are chosen since they are the de facto benchmark for privacy related studies. The dataset provides 14 inputs that are a combination of categorical, numerical, and ordinal types. The target variable requires classification of data into two salary classes, one belonging to the "less than 50k" group and the other belonging to the "greater than 50k" group. The diabetes dataset has 20 attributes containing categorical and numerical data, with around 200,000 instances and the presence/absence of the disease condition as the target. For both datasets, the optimization ranges of the different hyperparameters considered are shown in Table I.

VII. EVALUATION
In this section, the performance of DPBODL is compared with that of the standard deep learner. Epsilon, accuracy, and execution time were noted for both learners. Tables II and III show the values of epsilon, accuracy, and execution time for the adult and diabetes datasets respectively. The graphs in Figures 3 and 4 show the comparison of results between the standard learner and DPBODL for the two datasets. Maximizing the privacy and the utility of the analysis in the current context involves minimizing epsilon and maximizing the classification accuracy. The standard learner's performance for the two datasets is shown in Figure 3. Epsilon grows (that is, privacy degrades) as accuracy increases, and execution time ranges up to 600s. The DPBODL's performance is shown in Figure 4. The DPBODL achieves an enhanced trade-off in comparison with the standard learner with reference to epsilon and accuracy. While DPBODL gives epsilon values ranging between <0.78, 5.34> for adult and <0.87, 8.34> for diabetes, the standard model's learning starts with larger values of epsilon, in the range <5, 11>. Similarly, the enhanced learning speed of the optimized learner is due to the optimization of the number of epochs for the training of the model. It can be observed that, on average, 3-6 epochs are required to give an accuracy of approximately 80%. On the other hand, the standard learner requires at least 10 epochs to give a starting accuracy of 80%. Essentially, DPBODL's enhanced performance is attributed to efficient hyperparameter optimization.
Privacy-utility performance analysis-standard learner.
VIII. COMPARATIVE ANALYSIS
This section compares the proposed DPBODL mechanism with the state-of-the-art techniques for privacy preserving analytics. For the purpose of comparative analysis, three different techniques have been chosen, namely Bayesian Optimized Diff Private Pareto (BO-Dpareto) [9], Privacy Preserving Deep Learning (PPDL) [10], and Linear Regression-Diff Private Convex Optimization (LR-DPCO) [15]. Both PPDL and BO-Dpareto have been chosen for the comparative study for two reasons. Primarily, these approaches use deep learning for privacy preserving learning. Secondly, they use tabular data for analysis, in contrast to many other approaches, which predominantly use image data. LR-DPCO uses shallow learning [16], but the algorithm's results are equivalent to those of many deep learning approaches, and including it here ensures a fair comparison. The graph in Figure 5 shows the comparative performance of the prevailing techniques and DPBODL. PPDL is able to minimize epsilon, but at the cost of a decline in classification accuracy. For epsilon values in the range <$10^{-2}$, $10^{-1}$>, PPDL achieves about 63% classification accuracy. As epsilon is compromised (larger values of epsilon), accuracy improves. BO-Dpareto and LR-DPCO have classification accuracies of 75-80% for the epsilon range <$10^{-2}$, 10>. Driving epsilon as low as $10^{-2}$ in the existing techniques causes a corresponding reduction in the accuracy of the analysis. Therefore, when considering the epsilon-accuracy compromise, the DPBODL approach balances it efficiently, without severely affecting analysis accuracy, and hence is achievable in a practical scenario. It can be argued that the technique gives a fair compromise between the two objectives. Lower epsilon ranges give an accuracy of about 80% for both datasets, and it can be noticed that analysis accuracy stabilizes thereafter.
For values of epsilon greater than 1, a higher accuracy (85-90%) is achieved with DPBODL, whereas the known state-of-the-art approaches achieve an average accuracy of only 85%. Hence the hyperparameter optimization of the deep learner has resulted in a good balance between privacy and utility. As far as computational efficiency is concerned, a straight comparison between DPBODL and the aforementioned algorithms is unreasonable, since they differ by factors such as data size, dimension, and dynamics. Hence its computational efficiency is compared with that of the standard deep learner. DPBODL performs well in comparison to the standard learner, as shown in Figure 6. The comparison shows that the number of epochs required to train a standard learner is much higher than for its optimized variant. So, it can be perceived from the comparative analysis that the proposed approach has been able to provide a reasonable performance trade-off.
Fig. 6. Performance comparison.
The next section discusses the achieved privacy-utility trade-off in detail by considering the Pareto-front generated by the approach and the extent to which the approach generates optimal solutions.

IX. STATISTICAL ESTIMATION OF OPTIMAL PRIVACY-UTILITY TRADE-OFF
This section makes a statistical assessment of DPBODL's results and computes an optimal privacy-utility trade-off. While selecting an optimal solution, the execution efficiency of the model is considered independent of these primary objectives, but the model does not overlook computational efficiency in the process of identifying a compromise solution. From the objective space consisting of a set of optimal solutions, as shown by the Pareto-front in Figure 7, an optimal solution is identified. The statistical analysis is carried out for the adult dataset. The utopian method [8] is employed to determine this optimum point. It is a decision-making strategy that determines the near-optimal <epsilon, accuracy> pair by comparing all data points to an ideal point called the utopian point.
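The utopian-point selection can be sketched as a short function. This is a hedged illustration only: the paper does not state which distance measure is used or whether the objectives are rescaled, so Euclidean distance on min-max normalized objectives is assumed here, the utopian point is taken as the best observed value of each objective, and the function name `utopian_choice` and the sample points are hypothetical.

```python
import math


def utopian_choice(points):
    """Pick the (epsilon, accuracy) pair closest to the utopian point.

    The utopian point combines the smallest observed epsilon with the
    largest observed accuracy. Both objectives are rescaled by their
    observed range so that neither dominates the distance.
    """
    eps_vals = [p[0] for p in points]
    acc_vals = [p[1] for p in points]
    utopian = (min(eps_vals), max(acc_vals))
    # Fall back to 1.0 when all values coincide, to avoid dividing by zero.
    eps_span = (max(eps_vals) - min(eps_vals)) or 1.0
    acc_span = (max(acc_vals) - min(acc_vals)) or 1.0

    def distance(p):
        return math.hypot((p[0] - utopian[0]) / eps_span,
                          (p[1] - utopian[1]) / acc_span)

    return utopian, min(points, key=distance)


# Made-up Pareto-front points, not the values reported in the paper.
front = [(0.5, 70.0), (1.0, 80.0), (2.5, 85.0), (4.0, 86.0)]
utopian, chosen = utopian_choice(front)
```

For these illustrative points the utopian point is (0.5, 86.0), which no real solution attains, and the closest attainable compromise is (1.0, 80.0).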
Let the conflicting objective functions in the current setting be defined as P(m) and U(m) (defined in Section IV). While P(m) determines privacy measured by epsilon, U(m) indicates the utility measured by the accuracy of classification analysis. The objective is to maximize both privacy and utility. Maximizing privacy in the context of differential privacy involves achieving smaller values of epsilon, while maximizing utility involves achieving higher classification accuracy. The objective space is shown as a scatter plot in Figure 7. The plane shows the relationship between data points. The utopian point is positioned at (0.783, 89), indicating the ideal values for epsilon and classification accuracy. After estimating the distance of all points from the utopian point, the closest one is chosen as the final optimal solution, justifying the ideal trade-off between the objectives of the problem. The results of this calculation are shown in Table IV. In this problem setting, a point with an epsilon value of 0.783 and a corresponding classification accuracy of 84% is found to be the closest. Table V shows the comparison in trade-off between the existing and the proposed technique. The {epsilon, accuracy} pair shows an enhanced trade-off in comparison to the existing techniques. Computational efficiency has been evaluated only for PPDL, and results show that DPBODL is executed in a reduced number of epochs in comparison with PPDL for a corresponding epsilon value.
X. CONCLUSIONS
In this paper, an optimization of the deep learning technique for differentially private learning is proposed. Conceptualizing private learning as a multi-objective optimization problem, the proposed method aims to find an enhanced privacy-utility-performance trade-off for private learning. Although it is challenging to find a single optimal solution that is mathematically best for a multi-objective problem, the proposed method substantiates this trade-off by employing an appropriate decision-making approach.
Firstly, various trade-off solutions are generated with the optimized learner. To decide on the optimum point, Pareto-optimal decision-making is carried out. The results show that the trade-off achieved is a quantifiable enhancement over the existing techniques. The proposed method has also considered execution efficiency, which was not evaluated by many of the existing techniques.