Optimizing Number and Locations of Alternative-Fuel Stations Using a Multi-Criteria Approach

The transition to alternative fuels is necessary due to the finite supply of fossil fuels and their rising prices. However, the transition cannot happen unless sufficient infrastructure exists, and fueling stations are a key part of that infrastructure. As establishing alternative-fuel stations is expensive, the problem of finding the optimal number and locations of initial alternative-fuel stations emerges, and it is investigated in this paper. A mixed-integer linear programming (MILP) formulation is proposed to minimize the costs using the net present value (NPV) technique. The proposed formulation considers the criteria of the two most common models in the literature for this problem, namely the p-median model and the flow refueling location model (FRLM). A decision support system is developed so that users can control the parameter values and run different scenarios. As a case study, the method is used to find the optimal number and locations of alternative-fuel stations in the city of Chicago. Data wrangling techniques are used to overcome the inability of the method to solve very large-scale problems.
Keywords-continuous location problem; multi-criteria decision making; alternative-fuel stations; mixed-integer linear programming; optimization; Chicago transportation network


I. INTRODUCTION
Since it is estimated that oil and gas resources will be exhausted within the next 200 years, the use of alternative fuels such as hydrogen [1,2], electricity [3], and natural gas [4] instead of gasoline and diesel is sensible. Hence, there is a growing interest in vehicles powered by alternative fuels. Research has pointed out the significant impact that constructing refueling stations has on the promotion of alternative fuels [5][6][7][8][9]. Many researchers and industry representatives have addressed the which-comes-first issue regarding alternative-fuel (alt-fuel) stations and vehicles [10]. In [11], the which-comes-first question involving refueling stations and vehicles is likened to the "chicken-and-egg" dilemma. That research asserts that optimizing the location of refueling stations is essential to maximize the potential for consumers to use alt-fuel vehicles, especially in the early stages of the transition to an alternative fuel. Whether the existence of refueling stations drives the demand for alternative fuels or vice versa, it is evident that the establishment of fuel stations and the optimization of their locations are crucial.
Optimization techniques have long been used in supply chain design, logistics, and supply chain planning problems [12][13][14][15][16]. The use of optimization methods for finding optimal locations, especially in energy applications, has also been investigated in [17][18][19]. Various mathematical models have been proposed in the literature specifically for optimally locating fuel stations. Some have addressed the problem with the p-median model, a commonly used location-allocation model whose objective is to minimize the total distance traveled from demand nodes to a given number p of facilities by optimally locating the facilities and allocating the demand nodes to them [20,21]. It has been used in [22][23][24] to optimally locate alt-fuel stations close to where people live, since consumers usually prefer to refuel near their homes according to empirical studies [3,25]. The p-median model was first applied to fuel stations in [26] as an objective in a multi-objective programming model for relocating existing gas stations. The authors of [9] developed the "fuel travel-back" approach, whose structure is very similar to that of the p-median model. In this approach, however, nodes are weighted by the quantity of fuel consumed on the road segments passing through them instead of by population. Moreover, travel time between nodes is considered instead of distance. Vehicle-miles traveled (VMT) data are used in this approach to minimize the total travel time for all the fuel (in gallon-minutes) needed to travel from everywhere to the nearest fueling station. The advantage of this approach is that it only needs road network data and population data, which are easily accessible in GIS format from a variety of sources.
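To make the p-median objective described above concrete, the following minimal sketch evaluates the weighted distance from each demand node to its nearest facility and selects p sites by exhaustive search. The function names, the rectilinear metric, and the toy data are illustrative assumptions rather than anything taken from the cited studies, and exhaustive search is only practical for very small instances.

```python
from itertools import combinations

def rectilinear(a, b):
    """Rectilinear (1-norm) distance between two 2D points (assumed metric)."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def p_median_cost(nodes, weights, facilities, dist=rectilinear):
    """Weighted sum of each demand node's distance to its nearest facility."""
    return sum(w * min(dist(n, f) for f in facilities)
               for n, w in zip(nodes, weights))

def best_sites(nodes, weights, candidates, p, dist=rectilinear):
    """Exhaustively pick the p candidate sites with the lowest p-median cost."""
    return min(combinations(candidates, p),
               key=lambda sites: p_median_cost(nodes, weights, sites, dist))

# Hypothetical toy instance: three demand nodes on a line, the first one heavier.
nodes = [(0, 0), (2, 0), (10, 0)]
weights = [5, 1, 1]
sites = best_sites(nodes, weights, candidates=nodes, p=2)
cost = p_median_cost(nodes, weights, sites)
```

In this toy instance the two selected sites are the heavy node and the distant node, so only the middle node has to travel.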
Another approach tries to locate fuel stations on high-traffic routes. For example, a second criterion that maximizes traffic flows on the roads that have fuel stations is considered in [26] along with the p-median criterion. A similar approach is proposed in [1], considering only high-traffic roads (with at least 20,000 passing vehicles per day). The logic behind this approach is that drivers usually refuel on their way to somewhere else and do not deviate much from their route to refuel. The criteria considered in [27] are a hybrid of the first two types, maximizing the population on covered links. Another approach for locating refueling stations, originally named the flow-capturing location model [28] and later termed the flow-intercepting location model (FILM) [29], maximizes passing traffic flows without counting the same traffic multiple times. Since these models treat the flows on paths across a network, which represent the routes people travel, as the demand, in place of population (p-median models) or road traffic (traffic-count models), they are classified as path-based or flow-demand models.
The main criterion in the FILM is to locate facilities so that the number of trips captured or intercepted (i.e., trips with a facility anywhere along their path) is maximized. The FILM supports the idea that drivers refuel on their way to somewhere else rather than making trips for the sole purpose of refueling, and each intercepted flow is counted only once in the standard FILM, regardless of the number of stations on its route. The first problem with the FILM is that it needs a matrix of precise traffic flows from origins to destinations, namely "trip table" data, which may not be accessible for all regions and is more complicated than population data. The other problem is that it considers a path intercepted if it passes at least one fuel station, regardless of the limited driving range of vehicles and the need to refuel multiple times on longer inter-city trips to complete the path without running out of fuel. Given that electric vehicles and hydrogen-powered vehicles suffer from limited energy storage capabilities, the flow refueling location model (FRLM) was developed [30] to address this critical problem. In the FRLM, a flow is considered refueled only if there are stations along the path such that drivers can complete their round trip via the path without running out of fuel, given the limited driving range of their vehicles. The FRLM aims at maximizing the number of trips that can potentially be refueled by a given number of stations, and it has been applied to real-world networks in Florida [31] and Arizona [32]. Maximizing trip-miles instead of trips [31], considering stations with limited capacities [11,33], and considering locations along arcs [34] are some of the extensions to this model. Another advantage of the FRLM is that it performs better on the p-median model's objective than the p-median model does on the FRLM's objective. However, the FRLM has two major problems. First, it locates stations far from people's homes. Hence, the long distances may leave some people without access to any refueling station, making them unable to use alt-fuel vehicles. This reduces the robustness of the locations. Second, the model requires the flow data of all origin-destination pairs, considering all possible combinations of stations that can capture a route. Thus, the applicability of the model is disputable.
Another refueling-station-location model is proposed in [35], based on a set-covering mixed-integer programming formulation and the vehicle routing concept. The proposed technique only requires a matrix of origin-destination data, which is usually readily accessible. However, the study does not consider traffic flows, a crucial factor, in its model.
Building a true multi-objective model that trades off the flow-refueling and p-median objectives is proposed as the next logical step in the future work sections of many research studies (e.g., [11]). A multi-objective model allows different decision-makers to increase or decrease the weights on these objectives. For example, covering most of the major regions could be more important for governmental decision-makers, while maximizing profit could be more vital for decision-makers in the private sector.

II. PROBLEM DEFINITION
Based on the above, a multi-criteria model was developed for locating alt-fuel stations in a region, considering the p-median objective together with traffic flows and the construction costs of stations. The model tries to locate stations close to people's homes and close to the traffic flows at the same time, while constructing as few stations as possible. In all the models mentioned above, the number of stations is assumed to be given. The question, however, is how this number can be determined. Since constructing alt-fuel stations is very costly, covering as much of the consumers' demand for fuel as possible while constructing the minimum number of stations is essential. Although several studies have independently addressed the general questions of how many refueling stations are needed and where the optimal locations for stations are, the literature has been mostly silent on how to trade off the locations of stations against the sufficient number of hydrogen stations. In this research, a multi-criteria mathematical model is formulated to minimize the total costs of constructing stations and the costs of the fuel used for all commutes made for the purpose of refueling, using the simple net present value (NPV) technique of engineering economics. This formulation can find the optimal locations while optimizing the number of stations. Moreover, candidate sites for stations are mostly considered discrete in the literature. However, it is very difficult to find all the candidate sites across a big city like Tehran. Even if all candidate sites were found, the resulting large number of binary variables could make the problem harder to solve optimally in a reasonable time. Furthermore, if only a few of them are considered as candidate sites, some parts of the solution space may be missed. Therefore, the solution space for candidate sites is considered continuous in this paper. This makes the problem similar to a continuous location problem (Weber location problem).
To take traffic flows into account in the model, the most common method in the literature is to use all origin-destination pairs, but finding all these pairs is very difficult for big cities like Tehran. It is also practically impossible in places where GIS (geographic information system) is not widely deployed. To address this problem, we propose using flow nodes, analogous to population nodes. For example, a route (highway) about 40 kilometers long can be divided into two or more nodes, each 20 kilometers long or less. Then, weights proportional to the traffic flow at the centers of these nodes can be assigned to them. Average daily traffic (ADT) data for cities conveys the same information. Population nodes can be determined similarly from census block population data. We can ignore population nodes with very low population and flow nodes with very low traffic flows.
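The flow-node construction described above can be sketched as follows. This is a minimal illustration under stated assumptions: the helper names `split_route` and `drop_light_nodes`, the callable `adt_at` that returns traffic at a position along the route, the equal-length split, and the filtering threshold are all hypothetical, and the node radius is taken to be half the node length.

```python
import math

def split_route(length_km, adt_at, max_node_km=20.0):
    """Split a route into equal-length flow nodes no longer than max_node_km.

    adt_at(km) is a hypothetical callable giving the average daily traffic
    at a position along the route; each node is weighted by the traffic
    at its center.
    """
    n = max(1, math.ceil(length_km / max_node_km))
    node_len = length_km / n
    nodes = []
    for i in range(n):
        center = (i + 0.5) * node_len
        nodes.append({
            "center_km": center,
            "radius_km": node_len / 2.0,  # assumed: radius is half the node length
            "weight": adt_at(center),
        })
    return nodes

def drop_light_nodes(nodes, min_weight):
    """Ignore nodes whose weight falls below a chosen threshold."""
    return [node for node in nodes if node["weight"] >= min_weight]

# Hypothetical 40 km highway: busy first half, quiet second half.
flow_nodes = split_route(40.0, lambda km: 30000 if km < 20 else 500)
kept = drop_light_nodes(flow_nodes, min_weight=1000)
```

With the 40 km example from the text, this yields two nodes of 20 km each, of which only the busy one survives the filter.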

III. MATHEMATICAL FORMULATION
A multi-criteria MILP (mixed-integer linear programming) formulation is proposed below to solve the problem of finding the optimal number and locations of alt-fuel stations. Some of the characteristics of the problem are listed below:
• There can be at most I alt-fuel stations.
• There is a set of J nodes for traffic flows determined beforehand.
• The flow of each flow node is deterministic.
• There is a set of K nodes for population distribution determined beforehand.
• The population of each population node is deterministic.
• Construction of each station costs C1 dollars.
• The time horizon T for using newly constructed stations is 30 years.
• There is an inflation rate of 15% for alternative fuels' prices and a depreciation rate of 10% for all costs.
• The distance metric is assumed to be rectilinear (1-norm). This norm seems more reasonable than the Euclidean distance (2-norm) since, in a city street network, there is usually no straight-line path between two points, only a rectilinear one.
• Total population TP and total number of cars TC in the region are deterministic. So, the average number of cars per person is TC/TP (TC divided by TP).
• Traveling from an origin to a station for refueling consumes some fuel. The cost C2 of the fuel burned per unit of distance is deterministic.
• rj is the radius of the identified node j.
• The vehicle range is assumed to be 100 km per full refueling. So, each flow travels 2*rj kilometers and the vehicle needs refueling every 100 km, hence the term 2*rj/100 (2 times rj divided by 100) in the formulation.
• The candidate space for stations is considered continuous.
• Weights W1, W2, and W3 are assigned to the objective functions so that the decision-maker can control the policy. The weights should sum to one.
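As a rough illustration of how the time horizon, inflation, and depreciation assumptions listed above could enter an NPV calculation, the sketch below computes a present-value multiplier for a recurring annual fuel cost, together with the refueling frequency term 2*rj/100. It assumes that the depreciation rate acts as a discount rate and that costs occur at the end of each year; both are assumptions, and the function names are hypothetical, so the paper's own NPV expression may differ.

```python
def present_value_factor(inflation, discount, years):
    """Sum over t = 1..years of ((1 + inflation) / (1 + discount)) ** t.

    Multiplying a current annual cost by this factor gives the present
    value of that cost recurring for `years` years, with prices inflating
    and future amounts being discounted (assumed end-of-year timing).
    """
    ratio = (1.0 + inflation) / (1.0 + discount)
    return sum(ratio ** t for t in range(1, years + 1))

def refuels_per_trip(radius_km, range_km=100.0):
    """Refueling frequency term 2*rj/range for a flow node of radius rj."""
    return 2.0 * radius_km / range_km

# Values from the list above: 15% inflation, 10% depreciation, T = 30 years.
factor = present_value_factor(inflation=0.15, discount=0.10, years=30)
frequency = refuels_per_trip(radius_km=10.0)
```

Because the assumed inflation rate exceeds the discount rate here, each later year contributes more than the previous one, so the factor exceeds the 30 that would result from equal rates.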
A. Notations
1) Indices
Equation (1) represents the objective function and comprises three parts for three different objectives. All three objectives are transformed into cost functions so that they can be summed. Constraints (2), (3), (18), and (19) define the values of fdij and pdik, which are the minimum distances between nodes and stations. Constraints (4), (5), (6), and (22) indicate whether a station must be constructed or not. Constraints (7)-(14) replace the absolute-value terms in the rectilinear distance function to make the model linear.
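For reference, a textbook way to replace the absolute-value terms of a rectilinear distance with linear constraints is sketched below. The notation is hypothetical and not the paper's: $(x_i, y_i)$ are the coordinates of station $i$, $(a_j, b_j)$ are the known coordinates of node $j$, and $u_{ij}, v_{ij}$ are auxiliary non-negative variables. The paper's constraints (7)-(14) may take a different form.

```latex
% Hypothetical notation; a standard linearization, not necessarily constraints (7)-(14).
\begin{align}
  d_{ij} &= \lvert x_i - a_j \rvert + \lvert y_i - b_j \rvert
  && \text{(rectilinear distance, nonlinear)} \\
  u_{ij} &\ge x_i - a_j, \qquad u_{ij} \ge a_j - x_i \\
  v_{ij} &\ge y_i - b_j, \qquad v_{ij} \ge b_j - y_i \\
  d_{ij} &= u_{ij} + v_{ij}
  && \text{(linear replacement)}
\end{align}
```

This replacement is exact only when the optimization pushes $u_{ij}$ and $v_{ij}$ downward, as it does when distances appear with positive cost coefficients in a minimization objective.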

V. CASE STUDY
To examine the capability of our model, we applied it to a real case study in the United States. We considered Chicago, IL as our case study and found the optimal locations for alternative-fuel stations in this city. We obtained Chicago's census block population data, which contain 46291 blocks [45]. These data include the latitudes and longitudes of the vertices of the census blocks along with the population of the blocks. We also obtained average daily traffic count data, which contain 1279 traffic nodes [46]. The data for traffic nodes include the latitudes and longitudes of the nodes representing streets and roads along with the average daily number of vehicles passing through those streets and roads.

A. Data Wrangling
Some preprocessing was performed on the data to make them usable in the formulation. The means of the latitudes and longitudes of the vertices of each census block were calculated and taken as the latitude and longitude of the population node representing the whole block. The locations of the population and traffic nodes were represented in the geographic coordinate system using latitudes and longitudes. The proposed method, however, requires the data to be in a two-dimensional (2D) Euclidean coordinate system. Therefore, we projected our location data to a 2D coordinate system.
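The centroid and projection steps can be sketched as follows, assuming an equirectangular projection about a reference latitude. This is one common choice and not necessarily the exact formula used in the study; the function names, the reference latitude, and the sample block are hypothetical.

```python
import math

R = 6371000.0        # approximate Earth radius in meters
C = math.pi / 180.0  # degrees-to-radians conversion factor

def block_centroid(vertices):
    """Mean latitude and longitude of a census block's vertices."""
    lats = [lat for lat, _ in vertices]
    lons = [lon for _, lon in vertices]
    return sum(lats) / len(lats), sum(lons) / len(lons)

def project(lat_deg, lon_deg, ref_lat_deg):
    """Equirectangular projection to planar (x, y) in meters (assumed form)."""
    x = R * C * lon_deg * math.cos(C * ref_lat_deg)
    y = R * C * lat_deg
    return x, y

# Hypothetical rectangular block near downtown Chicago.
block = [(41.88, -87.64), (41.88, -87.62), (41.90, -87.62), (41.90, -87.64)]
lat, lon = block_centroid(block)
x, y = project(lat, lon, ref_lat_deg=41.88)
```

Over an area the size of a single city, the distortion from such a simple projection is small relative to the distances between node clusters.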
In the projection, R=6371000 is an approximation of Earth's mean radius in meters and C=π/180 is the coefficient for converting degrees to radians. Since the large number of nodes in this case study could hinder the ability of our proposed method to find the optimal solution in a reasonable amount of time, we used the K-Means clustering algorithm to group our population nodes and traffic nodes into 20 node clusters. After clustering, the total population of each cluster is equal to the aggregate population of the nodes in the cluster, and the total traffic of each cluster is equal to the aggregate traffic of the nodes in that cluster.
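The clustering and aggregation step can be sketched as below, using a plain implementation of Lloyd's K-Means algorithm so the example has no external dependencies. The function names, the random initialization, and the toy data are assumptions; the study's actual K-Means implementation and settings are not specified.

```python
import random

def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's algorithm on 2D points; returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest centroid by squared Euclidean distance.
        labels = [
            min(range(k), key=lambda c: (p[0] - centroids[c][0]) ** 2
                                        + (p[1] - centroids[c][1]) ** 2)
            for p in points
        ]
        # Update step: mean of the members; keep the old centroid if empty.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append((sum(m[0] for m in members) / len(members),
                                      sum(m[1] for m in members) / len(members)))
            else:
                new_centroids.append(centroids[c])
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, labels

def aggregate(weights, labels, k):
    """Sum node weights (population or traffic) within each cluster."""
    totals = [0.0] * k
    for w, lab in zip(weights, labels):
        totals[lab] += w
    return totals

# Hypothetical projected nodes (meters) with populations.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
population = [100, 200, 300, 400]
centroids, labels = kmeans(points, k=2)
totals = aggregate(population, labels, k=2)
```

The same two functions apply unchanged to traffic nodes, with average daily traffic counts as the weights.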

VI. COMPUTATIONAL RESULTS
We coded the formulation in Python and solved the case study using the IBM ILOG CPLEX 12.7 solver. Based on that, we developed a decision support system (DSS) in which the user can control the objectives' weights W1, W2, and W3, the cost coefficients C1, C2, and AC, and the maximum number of stations. The DSS visualizes the data and optimal solutions on a map of Chicago in HTML format. The following parameter values were assumed for the case study: W1=0.4, W2=0.5, W3=0.1, C1=$1,000,000, C2=$30, AC=(TC/TP)=0.125.
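The user-controlled parameters listed above could be held in a small scenario object such as the one sketched here. The class name, field names, and validation rules are hypothetical and do not describe the actual DSS code; the default values are the case-study assumptions stated in the text.

```python
import math
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Scenario:
    """One DSS run: objective weights, cost coefficients, and station cap."""
    max_stations: int
    w1: float = 0.4          # weight of the first objective
    w2: float = 0.5          # weight of the second objective
    w3: float = 0.1          # weight of the third objective
    c1: float = 1_000_000.0  # construction cost per station, dollars
    c2: float = 30.0         # fuel cost per unit of distance, dollars
    ac: float = 0.125        # average cars per person, TC/TP

    def __post_init__(self):
        if self.max_stations < 1:
            raise ValueError("max_stations must be at least 1")
        if not math.isclose(self.w1 + self.w2 + self.w3, 1.0):
            raise ValueError("weights W1, W2, W3 should sum to one")

# Base case and a variant with a higher fuel cost.
base = Scenario(max_stations=10)
expensive_fuel = replace(base, c2=300.0)
```

Keeping scenarios immutable and deriving variants with `replace` makes it straightforward to compare runs that differ in a single parameter.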
In all following figures, population clusters, traffic flow clusters, and optimal locations of alt-fuel stations are shown as green, blue, and red circles, respectively. For a maximum number of stations greater than or equal to 4, the DSS decides to open 4 stations, as shown in Figure 1.
[Figure: Optimal locations for C2=$300 and maximum number of stations equal to 10]

VII. CONCLUSIONS
A mathematical modeling approach was proposed in this paper for determining the optimal number and locations of alternative-fuel stations. The net present value technique of engineering economics was used in this multi-criteria model. A decision support system using data wrangling techniques was developed in Python based on the proposed approach. The decision support system uses the IBM ILOG CPLEX solver to solve the mixed-integer linear programming formulation. The approach was tested on a case study in the city of Chicago.
The results highlight the effectiveness of the approach. They also point to the flexibility of the approach with respect to the decision-makers' preferences. However, this flexibility comes at a price: the results depend heavily on the parameter values. As a direction for future work, uncertainties can be considered in the proposed modeling approach to obtain more robust solutions to the problem.
www.etasr.com Badri-Koohi et al.: Optimizing Number and Locations of Alternative-Fuel Stations Using …

Fig. 1. Optimal locations for maximum number of stations >= 4.
The sizes of the population and traffic flow clusters are proportional to the population and traffic volumes. The results

Fig. 2. Optimal locations for maximum number of stations equal to 1, 2, and 3 from left to right.