R-SOR: Ranked Social-based Routing Protocol in Opportunistic Mobile Social Networks

Exploiting social information to improve routing performance is an increasing trend in Opportunistic Mobile Social Networks (OMSNs). Selecting the next relay node for a message based on users' social behavior is a critical factor in attaining a high delivery rate. To select the next relay efficiently, the correlation between daily social activities and the social characteristics in the user profiles can be exploited. In this paper, we consider the impact of the social characteristics on mobile user activities during certain periods of the day and then rank these characteristics by their relative importance so that they can be included in the routing protocol. These processes form the proposed Ranked Social-based Routing (R-SOR) protocol, which provides an effective way to disseminate data in OMSNs. We use the real data set INFOCOM06 to evaluate the proposed protocol. The experimental results show that the proposed protocol has higher routing efficiency than flooding-based protocols such as EpSoc and Epidemic, prediction-based protocols such as PRoPHET, and social-based protocols such as MSM and Bubble Rap.

Keywords: opportunistic networks; mobile social networks; data dissemination; social-based routing

I. INTRODUCTION
Nowadays, smart mobile devices are present in all areas of human activity. They transform the ways of sharing data into a new paradigm of Mobile Social Networks (MSNs) [1][2][3][4]. MSNs combine the social features of users carrying mobile devices in the strategy for data delivery. In an Opportunistic Mobile Social Network (OMSN), the mobile nodes communicate opportunistically and data forwarding occurs when they encounter each other. In other words, routes from the sender to the destination of a message are created dynamically, and any possible node can opportunistically be used for the next hop. For these reasons, data routing in such networks is a crucial challenge.
The Store-Carry-Forward (SCF) method was developed as a data forwarding mechanism in opportunistic networks. With the SCF mechanism, each node stores data packets in the buffer. When the node encounters another node, it forwards the duplicated data packets. As a result, network resources such as bandwidth and node packet buffers are consumed in large amounts. For this reason, selecting the appropriate relay is a major challenge and key issue in OMSNs. In [5], the performance of different routing protocols in opportunistic networks is analyzed. For instance, epidemic routing protocols [6,7] use a flooding-based principle for spreading copies of messages to newly discovered contacts. Prediction-based routing protocols, such as PRoPHET [8,9], use contact history between users to estimate nodes' delivery probability which characterizes the probability of successfully delivering a message to the destination from a local node. Social-based routing is an emerging research trend where researchers look for ways to use social information of individuals to assist routing as it has been found that people follow almost stable patterns in their social behavior [10].
Routing in mobile networks is a challenging issue [11]. Networks such as the Vehicular Ad-hoc Network (VANET) and the Wireless Sensor Network (WSN) require efficient routing protocols due to their dynamic topology [12,13]. Exploiting long-term social features for routing in a disruptive and varied network environment is a fruitful approach [14][15][16]. Several social aware routing protocols have been proposed [17][18][19]. Social aware schemes exploit different social metrics such as centrality [20,21], similarity [22], betweenness [23], and social ties and friendship [24]. A user's social profile contains a vector of several social features such as nationality, language, and topics of interest. These social features influence the social behavior and activity of the mobile user. For example, persons of the same nationality contact each other more often, people with common interests and contacts act similarly, and individuals with a common language tend to meet and talk with each other longer. In addition, most people follow a stable pattern or routine in their daily life. Generally, humans have different social activities at different periods of the day, so the influence of social features changes according to the day period. For example, most people who meet in the evening share the same nationality or similar affiliations.
The main objective of this work is to design an efficient forwarding scheme that achieves a high delivery ratio and decreases the network overhead. This is carried out by involving the ranking of social features with regard to a specific day slicing mechanism in the routing process. Our contribution is focused on designing a ranked social-based routing protocol, referred to as the R-SOR protocol, that seeks to improve the performance of OMSNs by considering the regularity of users' social behavior and by exploiting the relative impact of the social features during each day period. The contributions made under the proposed scheme are:
• We define a set of social profiles which can be involved in the routing process of OMSN.
• We estimate the probabilistic behavior of social activities according to the dynamic changes over time with regard to the defined day periods. An algorithm for updating the nodes' periods is proposed for this purpose.
• An algorithm for ranking social features according to their estimated probabilities is proposed.
• We developed the proposed R-SOR algorithm by selecting the best relay node having the highest rank of social feature similar to that of the destination in the current day period.
To investigate the proposed R-SOR approach, we carried out experiments using the real data set INFOCOM06 [25] to explore the impact of different social features in different time periods. The results of the simulation experiments demonstrate the efficiency of the R-SOR protocol in comparison to five benchmark routing protocols, namely flooding-based protocols such as EpSoc and Epidemic, prediction-based protocols such as PRoPHET, and social-based protocols such as MSM and Bubble Rap.

II. RELATED WORKS
Generally, routing schemes in OMSNs can be classified into different categories such as flooding-based, mobility prediction-based, context aware, and social aware protocols.
The stability of human social preferences and the regular patterns that social behavior follows give the opportunity to develop information provision. Several social features have been exploited by the research community [24]. For example, the SimBet routing protocol utilizes two social features, similarity and betweenness, for the forwarding process [26]. The values of betweenness centrality and similarity for each node are estimated with the ego network analysis technique. Only locally available information is considered. Messages are forwarded towards the node with higher betweenness centrality to increase the possibility of finding a potential carrier to the final destination. Concentrating load on central nodes in SimBet results in network congestion at these nodes. In [27], the authors propose the Bubble Rap protocol, which exploits two social and structural metrics, centrality and community. In this scheme, nodes belong to communities of different sizes and have different levels of centrality, denoted as rank. There are two ranking types for each node: the local ranking inside the node's community and the global ranking for the entire network. Both levels of ranking are used for forwarding data. Messages are forwarded to nodes having a higher global ranking until a node in the destination's community is encountered. Then, the messages are forwarded to nodes having a higher local ranking within the destination's community. However, employing community detection methods requires collecting network information and exchanging the results, which increases the overhead in the network.
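The two-level forwarding rule described above for Bubble Rap can be sketched in a few lines of Python. This is only an illustrative sketch: the `Node` class, its field names, and the assumption that each node belongs to exactly one community are simplifications introduced here, not details taken from [27].

```python
from dataclasses import dataclass


@dataclass
class Node:
    # Hypothetical container; field names are illustrative assumptions.
    name: str
    community: str
    global_rank: float
    local_rank: float


def bubble_rap_should_forward(carrier: Node, encountered: Node, destination: Node) -> bool:
    """Return True if the carrier should hand the message to the encountered node."""
    if encountered.name == destination.name:
        return True  # direct delivery
    carrier_in = carrier.community == destination.community
    encountered_in = encountered.community == destination.community
    if carrier_in:
        # Inside the destination's community: climb the local ranking only.
        return encountered_in and encountered.local_rank > carrier.local_rank
    # Outside: enter the destination's community, or climb the global ranking.
    return encountered_in or encountered.global_rank > carrier.global_rank
```

The sketch makes the cost noted above visible: both rank values and the community label must already be known at each node, which in practice requires community detection and rank estimation from exchanged contact information.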
Research on using multiple social features to develop effective social aware routing is gaining interest. The authors in [28] introduced the intelligent tree (Int-Tree) routing protocol, which exploits social ties, community density, and similarity of interests to improve routing performance. Inter-community and intra-community communications were considered. For intra-community communications, the social ties between nodes are utilized to select the next forwarders. Authors in [29] exploited both offline and online social network information of users to propose a social-based forwarding strategy for opportunistic networks. They combined the dynamic online centrality and the centrality detected from the contact history. As a result, the delivery ratio was increased and the number of message replications was reduced.
Authors in [30] proposed the Multi-Layer Social network based Opportunistic Routing (ML-SOR) protocol, which considered the dynamics in social structure and user behavior by exploiting 3 social features, namely node centrality, community structure, and social ties. Offline temporal social information gained from node encounters and online static social information acquired from social networks were utilized for a better selection of the next carrier. Authors in [31] proposed the Proximity-Interest-Social (PIS) routing protocol, which considered 3 social factors: physical proximity, user interests, and user social relationships. These factors were used along with a time slot management mechanism and a copy control method to develop an effective forwarding scheme. Authors in [32] proposed a multiple social metrics-based routing protocol in which 3 social metrics were exploited: centrality, social similarity, and social activeness. Forwarding decisions were made based on combinations of the impact of these three metrics and the correlation among them. Authors in [33] proposed a social aware routing scheme which depends on the features of the opportunistic network. In this scheme, dependability ratio, usability ratio, and weight factor are calculated as weights of human activities to obtain the optimal cooperation nodes in the network. Authors in [34] proposed the fuzzy routing-forwarding algorithm (FCNS), which exploits comprehensive node similarity (social and mobile similarities) in opportunistic social networks. Information about the state of the network is collected and updated to evaluate the social and mobile similarities of the nodes in the network, and then the transmission preference for each node is calculated through fuzzy evaluation. The forwarding strategy compares the transmission preferences of the nodes, and the node with the higher transmission preference is selected as the message relay. In addition, FCNS uses a feedback mechanism for more stable forwarding of the message in the network. The results showed good performance in terms of delivery ratio and routing overhead.
Authors in [35] considered the context of social importance when exploiting social similarity to improve routing performance in mobile opportunistic networks. Context information is recorded in the nodes' buffers, and a relay node is selected if it has the most similar social behavior to the destination. Authors in [19] proposed a hybrid routing scheme, EpSoc, to enhance the routing efficiency of the Epidemic protocol by exploiting social properties. The degree centrality metric was used to adjust the Time To Live (TTL) value of the forwarded messages in order to decrease the redundancy of the Epidemic protocol. Accordingly, EpSoc forwards messages towards the nodes that have higher degree centrality values.
In contrast to these works, where a limited number of social features and social activity patterns were considered in the implementation of the routing protocols, the proposed approach allows us to exploit all social characteristics available in a user's social profile. The proposed approach considers the mutual correlation between the importance of a social feature and the node's interactions in the network. Since the impact of different social features and social activity patterns over time is an important issue, we also rank them according to the user's social activity during different periods of the day. For each time period, we record and update node contacts and rank the social features based on their impact. The details of the proposed protocol and the related algorithms are presented below.

III. THE R-SOR PROTOCOL
The R-SOR protocol is a social-based routing protocol that ranks social characteristics and exploits them for efficient message forwarding in OMSNs. Several social characteristics such as nationality, language, interest topics, membership, and hobbies can be stored in a user's social profile. The social behavior and activities of people reflect their social characteristics during their daily routine. For example, researchers with similar research interests encounter each other and hold long discussions during meetings, and students with common hobbies tend to form groups and interact regularly on campus. These simple examples indicate that social features have an impact on individuals' social behavior during their daily routine. Table I summarizes the notations and definitions that will be used throughout the R-SOR protocol algorithms.

TABLE I. NOTATIONS AND DEFINITIONS
$w_{i,p}^{f_x}$: Social feature contact weight of a given social feature $f_x$ in the node i at the day period p.
$V_{j,p}^{f_x}$: Set of values of a social feature $f_x \in F$ in the node j at the day time period p.
$f_x$: Highest ranked social feature of destination in the current day time period.

A. Updating Nodes' Contacts Per Day Time Period
People have daily routines: they follow almost stable patterns of social behavior in their everyday life. To capture how people's social behavior evolves over the different time periods of the day, we propose slicing the day into 6 time periods, each 4 hours long, as depicted in Figure 1. This division into time periods has been reported in [33,36] and gives rise to social structures that reflect the different interactions that users have over different daytime periods. Each node records and updates the history of the nodes it encounters per time period while roaming the network. This contact information will be exploited to rank the social features. Let $C_p$ be the set of contact events during the day time period p, p = 1, ..., 6. A contact event c within a time period is characterized by the pair of contact nodes i and j and is denoted by $enc(i, j, c)$. The number of contacts of a node i with a node j during a time period p is calculated based on the following equation:

$$C_{i,j,p} = \sum_{c \in C_p} enc(i, j, c)$$

where $enc(i, j, c) = 1$ if there is a direct connection between i and j at the contact event c, otherwise $enc(i, j, c) = 0$.
Hence, the total number of contacts of node i in the time period p is calculated as:

$$C_{i,p} = \sum_{j \in N_{i,p}} C_{i,j,p}$$

where $N_{i,p}$ denotes the set of nodes encountering the node i within the time period p.
Algorithm 1, shown in Figure 2, is used to update node contacts for each day time period.
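The per-period contact bookkeeping described in this subsection can be sketched as follows. This is a minimal illustration and not a reproduction of Algorithm 1: the class and function names are hypothetical, and the sketch assumes that the first 4-hour period starts at midnight, which the text does not state.

```python
from collections import defaultdict

PERIOD_HOURS = 4                    # each day period is 4 hours long
NUM_PERIODS = 24 // PERIOD_HOURS    # 6 periods per day


def day_period(seconds_since_midnight: int) -> int:
    """Map a time of day to a period index p in 1..6 (assumes period 1 starts at 00:00)."""
    hour = (seconds_since_midnight // 3600) % 24
    return hour // PERIOD_HOURS + 1


class ContactHistory:
    """Per-node record of encounters, kept separately for every day period."""

    def __init__(self) -> None:
        # counts[p][j] = number of contacts with node j during period p
        self.counts = {p: defaultdict(int) for p in range(1, NUM_PERIODS + 1)}

    def record(self, other: str, seconds_since_midnight: int) -> None:
        """Register one contact event with node `other`."""
        self.counts[day_period(seconds_since_midnight)][other] += 1

    def contacts_with(self, other: str, p: int) -> int:
        """Number of contacts with one node in period p."""
        return self.counts[p].get(other, 0)

    def total_contacts(self, p: int) -> int:
        """Total number of contacts in period p, summed over all encountered nodes."""
        return sum(self.counts[p].values())
```

Keeping one counter table per period means a node only ever consults the table of the current period when it later ranks its social features.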

B. Ranking Social Features
Mobile nodes have several social features stored in the user's profile. Each social feature available in the user's profile may have a single value, like the country feature, or multiple values, such as the nationality and topics of interest. For every time period, the contacts with other nodes are recorded in the period contacts profile. The social features of each node are then ranked according to the number of contacts with other nodes that are associated with the same social features. Let $F = \{f_1, \ldots, f_{n_F}\}$ be the set of social features which can be associated with the nodes in the network, where $n_F$ is the number of social features included in the ranking process, and let $V_{i,p}^{f_x}$ be the set of values that a social feature $f_x$ can take at the node i, $x = 1, \ldots, n_F$. For a given node i and a time period p, the number of contacts with other nodes that share the same social feature $f_x$ is counted and is denoted by $C_{i,p}^{f_x}$. This is calculated using:

$$C_{i,p}^{f_x} = \sum_{j \in N_{i,p}} C_{i,j,p} \, \delta_{f_x}(i, j)$$

where $N_{i,p}$ is the set of nodes encountered by node i in period p, $C_{i,j,p}$ is the number of contacts between i and j in that period, and $\delta_{f_x}(i, j) = 1$ if $V_{i,p}^{f_x} \cap V_{j,p}^{f_x} \neq \emptyset$, otherwise $\delta_{f_x}(i, j) = 0$. In other words, the number of contacts of the node i with regard to the social feature $f_x$ is incremented by 1 when a node j encounters the node i if and only if at least 1 value of the concerned social feature is shared by both nodes. Accordingly, the weight of contacts associated with the social feature $f_x$ of node i at time period p is calculated by:

$$w_{i,p}^{f_x} = \frac{C_{i,p}^{f_x}}{C_{i,p}}$$

where $C_{i,p}$ is the total number of contacts of node i in period p. It is worth noting that $w_{i,p}^{f_x}$ is a measure of the significance of a given social feature $f_x$ and is referred to as the social feature contact (SFC) weight. The main idea behind this work is to involve the values of the SFC weights in the ranking process of the social features, which is ultimately used as input in the forwarding decision. To this end, the SFC weights of each node in the network are calculated and stored in ascending order in the node's profile. The social feature that has the highest SFC weight value is the top-ranked social feature and is considered the most impactful on the message forwarding process. Algorithm 2 (Figure 3) is used to record and update the ranked vector of the social features.
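The ranking step can be illustrated with a short Python sketch. It is not a reproduction of Algorithm 2: the function names and data layout are hypothetical, the SFC weight is assumed here to be the share of a node's contacts in the period that involve nodes with at least one common value for the feature, and ties are broken alphabetically purely for determinism.

```python
def shares_feature(values_i, values_j) -> bool:
    """True if the two nodes have at least one value of the feature in common."""
    return bool(set(values_i) & set(values_j))


def sfc_weights(profile_i, contacts, profiles):
    """Compute one weight per social feature of node i for a single day period.

    profile_i: {feature: set of values} for node i
    contacts:  {node_j: number of contacts with j in this period}
    profiles:  {node_j: {feature: set of values}}
    """
    total = sum(contacts.values())
    weights = {}
    for feature, values_i in profile_i.items():
        matched = sum(
            count
            for j, count in contacts.items()
            if shares_feature(values_i, profiles[j].get(feature, ()))
        )
        # Assumed normalization: matching contacts over all contacts in the period.
        weights[feature] = matched / total if total else 0.0
    return weights


def rank_features(weights):
    """Features ordered from highest to lowest weight (first element is top-ranked)."""
    return sorted(weights, key=lambda f: (-weights[f], f))
```

For instance, a node that met three peers 3, 1, and 2 times, of which the first and third share a language with it, obtains a language weight of 5/6 for that period.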

C. R-SOR Forwarding Strategy
The R-SOR protocol includes two steps before starting the message forwarding process. The first step captures the patterns of the mobile user's social behavior in the corresponding day time period, while the second step ranks the SFC weights that will be involved in the forwarding decision. Accordingly, every node ranks its social features per time period using Algorithm 2. Algorithm 3 describes the proposed R-SOR algorithm for message forwarding. Let node i be the current node. If a node j encounters the node i, node j is first checked for the availability of the destination's top-ranked social feature in its social profile. If the feature is available, the number of contacts regarding this feature in node j is compared to that in node i. If the number of such contacts of node j is greater than that of node i, the message is forwarded to node j; otherwise, node i keeps carrying the message. A message is also forwarded if the destination's top-ranked social feature is available in node j but not in node i. Figure 4 shows an example of the message forwarding process. Let S and D be the sender and destination nodes, respectively, and let N1, N2, and N3 be the intermediate nodes which are candidates for the next forwarding step. Assume that in the current day time period, the top-ranked social feature for node D is the "language". Node S first selects, from the intermediate nodes (N1, N2, and N3), those that have a language in common with D, in this case nodes N1 and N3. Node S then chooses the node that has the highest rank for the language feature, in this example N3. The forwarding algorithm is shown in Figure 5 (Algorithm 3).
Fig. 4. Forwarding process in R-SOR.
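The forwarding rule and the S/N1/N2/N3 example can be sketched in Python as follows. This is an illustrative sketch rather than Algorithm 3 itself: the dictionary layout and function names are hypothetical, "availability" of the destination's top-ranked feature is interpreted here as sharing at least one value of that feature with the destination, and the contact counts in the usage example are made-up numbers.

```python
def has_feature(node, feature, dest_values) -> bool:
    """True if the node shares a value of `feature` with the destination (assumed reading)."""
    return bool(node["profile"].get(feature, set()) & dest_values)


def should_forward(feature, dest_values, carrier, candidate) -> bool:
    """Decide whether the carrier (node i) hands the message to the candidate (node j)."""
    if not has_feature(candidate, feature, dest_values):
        return False
    if not has_feature(carrier, feature, dest_values):
        return True  # feature available in j but not in i
    return candidate["feature_contacts"].get(feature, 0) > carrier["feature_contacts"].get(feature, 0)


def choose_relay(feature, dest_values, carrier, candidates):
    """Among eligible candidates, pick the one with the most contacts for the feature."""
    eligible = [n for n in candidates if should_forward(feature, dest_values, carrier, n)]
    return max(eligible, key=lambda n: n["feature_contacts"].get(feature, 0), default=None)


# Usage mirroring the example: D's top-ranked feature is "language".
S = {"name": "S", "profile": {"language": {"de"}}, "feature_contacts": {"language": 5}}
N1 = {"name": "N1", "profile": {"language": {"en"}}, "feature_contacts": {"language": 4}}
N2 = {"name": "N2", "profile": {"language": {"it"}}, "feature_contacts": {"language": 9}}
N3 = {"name": "N3", "profile": {"language": {"en", "fr"}}, "feature_contacts": {"language": 7}}
relay = choose_relay("language", {"en"}, S, [N1, N2, N3])
```

In this usage example N2 is excluded despite having the most contacts, because it shares no language with D, and N3 is preferred over N1 because of its higher contact count for the language feature.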

IV. PERFORMANCE EVALUATION
A. Data Set
The dataset INFOCOM06 [25] was used for the simulations. This dataset is suitable for this study because of the availability of social characteristics of the mobile users. The participants were asked to fill in a questionnaire about different social features. We obtained this social information and attached it as social profiles to the mobile users in our experiments. We exploited the 10 available social characteristics, namely nationality, studies, languages, affiliation, position, city, country, stay, member, and topics. In this dataset, 78 mobile iMotes communicate using Bluetooth technology with a wireless range of around 30 meters. The encounters between the 78 nodes within this short range are recorded and tracked.

B. Simulation Setup
The Opportunistic Network Environment (ONE) [37] was used to conduct the experiments and to evaluate the proposed protocol. ONE is a Java-based simulation program which is designed primarily for opportunistic networks. The performance of the proposed R-SOR protocol is compared with that of 5 benchmark routing protocols, namely Epidemic and EpSoc, which are flooding-based routing protocols, PRoPHET, which is a prediction-based routing protocol, and Bubble Rap and MSM, which are social-based routing protocols. The simulation settings are summarized in Table II. In each experiment, we evaluate the performance of the considered protocols based on the following metrics:
• Successful delivery ratio: it is the ratio between the number of delivered messages and the total number of created messages. The ideal value of the successful delivery ratio is 1.0. This is attained when all created messages are delivered to their destinations.
• Overhead ratio: It reflects how many redundant messages are transmitted to deliver one message. It simply reflects the transmission cost in a network.
• Average latency: it is the average of the time elapsed between message creation and delivery.
• Average hop count: it is the average number of hops that messages traverse to reach the destination.
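The four metrics listed above can be computed from simple per-run counters, as in the Python sketch below. The record type and function names are hypothetical. The overhead ratio is written here as (relayed - delivered) / delivered, which is the convention used by the ONE simulator's message statistics report; the text does not give the formula explicitly, so this should be read as an assumption.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class DeliveredMessage:
    # Hypothetical record of one successfully delivered message.
    created_at: float    # creation time in seconds
    delivered_at: float  # delivery time in seconds
    hops: int            # number of hops traversed


def routing_metrics(created: int, relayed: int, delivered: list) -> dict:
    """Compute delivery ratio, overhead ratio, average latency, and average hop count."""
    n = len(delivered)
    nan = float("nan")
    return {
        "delivery_ratio": n / created if created else 0.0,
        # Extra transmissions spent per delivered message (assumed ONE convention).
        "overhead_ratio": (relayed - n) / n if n else nan,
        "average_latency": mean(m.delivered_at - m.created_at for m in delivered) if n else nan,
        "average_hop_count": mean(m.hops for m in delivered) if n else nan,
    }
```

Latency and hop count are averaged over delivered messages only, so a protocol that delivers few messages quickly can show a low latency while still having a poor delivery ratio; the metrics therefore need to be read together.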

C. Experiments and Results
To evaluate the performance of the R-SOR protocol and compare it against the other protocols, the message TTL and buffer size values are varied in the conducted experiments. It is worth noting that TTL and buffer size have a crucial impact on the routing performance in OMSNs because of the opportunistic behavior of the mobile node communication and the use of the SCF mechanism for message passing. Figures 6-9 show the performance of the proposed R-SOR protocol compared to that of the other routing protocols in terms of delivery ratio, overhead ratio, average latency, and average hop count. The experiments were carried out for different TTL values around the values given in Table II (from 10 minutes to 1.6 days).
Figure 6 shows the delivery ratio of the different protocols with varying TTL. The figure shows that for low TTL values (10m-1h), the delivery ratio of all the routing protocols is low, because most messages expire (their TTL reaches 0) and are dropped before they can be delivered. Increasing the TTL value raises the delivery ratio, since considerably more messages are delivered to their destinations; from a TTL value of 8h onwards, we observe a certain saturation. For higher TTL values (over 16h), the delivery ratio decreases slightly with further TTL increase. This is the result of the increased number of carried and ultimately dropped messages in the network, which negatively affects the delivery ratio. The figure reveals that the R-SOR protocol outperforms all other protocols for TTL values higher than 8h. Messages with a high TTL value are more likely to be carried by nodes that have strong social relations, enabling R-SOR to achieve a higher delivery ratio. The R-SOR protocol considers the regular patterns of the users' social activity during their everyday life and ranks the social characteristics based on the encounter rate in each day time period. This finding strongly suggests that the adopted ranking strategy leads to the selection of the relay node that has a higher probability of encountering the destination.
Fig. 7. Overhead ratio over TTL change.
Figure 7 presents the relationship between the overhead ratio and the TTL value. It can be seen that the R-SOR protocol outperforms the Epidemic, PRoPHET, and Bubble Rap protocols, and its advantage is most evident at higher TTL values (above 1h). In these scenarios, messages do not expire quickly, which enables the R-SOR protocol to update the contact profiles of the nodes and ultimately to rank the social characteristics in each day time period. The R-SOR protocol considers all social characteristics in the user's profile and ranks them according to the social activities of the mobile nodes for each period. As a result, nodes with strong social relationships with the destination node are chosen as relays. This reduces the number of intermediate nodes involved in the forwarding process and increases the likelihood of delivering the message; consequently, the overhead ratio is decreased substantially. In addition, Figure 7 shows that the overhead ratio of the EpSoc and MSM protocols is lower than that of the R-SOR protocol. This can be explained by the fact that these protocols apply rigorously defined rules for selecting the next forwarder. Nonetheless, this has a pronounced negative impact on their delivery ratio, as shown in Figure 6.
Figure 8 shows that the higher the TTL value, the higher the average latency for all protocols. Raising the TTL gives the messages the opportunity to traverse longer paths to reach their destinations. Hence, the higher the TTL value, the higher the average end-to-end delay.
We see that the R-SOR protocol outperforms the others in terms of average latency, with the exceptions of Epidemic and EpSoc, which perform best on this evaluation metric because of their flooding-based forwarding strategy. The figure reveals that choosing the appropriate relay based on the social feature ranking increases the likelihood of encountering the destination, so messages are delivered in less time, which confirms the effectiveness of the R-SOR protocol in terms of the average latency metric.
The performance evaluation of the protocols regarding the average hop count is shown in Figure 9. As the TTL increases, the hop counts increase for all protocols until they reach stable levels. The Epidemic protocol is affected by the TTL variation more than the other protocols because of its flooding-based forwarding strategy. The figure shows that the R-SOR protocol outperforms the non-social-based protocols (Epidemic and PRoPHET). The hop counts are reduced in R-SOR because exploiting social features limits the number of candidate relay nodes, as a relay node is only chosen if it has a similar value to the destination's top-ranked social feature. Compared to the social-based protocols (Bubble Rap, EpSoc, and MSM), the R-SOR protocol achieves similar or slightly better performance.
Fig. 9. Average hop count over TTL change.
In the following experiments, the performance metrics are evaluated by varying the buffer size while the TTL value is set to a constant value (10 hours). This value is selected based on the evaluation of the previous experiments, in which all protocols perform well for this TTL value. The message size is set to 128KB. Three levels of buffer size are considered in this study, namely low level (1, 2, and 5MB), middle level (15, 25, and 35MB), and high level (45 and 55MB). It is worth noting that increasing the buffer size increases the number of messages a mobile node can carry.
Figure 10 shows the performance of the protocols in terms of delivery ratio with varying buffer size. Increasing the buffer size leads to increased delivery ratios for all routing protocols. A larger buffer enables the mobile nodes to store and carry more messages and ultimately to deliver them to the destinations. The figure illustrates that the R-SOR protocol outperforms the other protocols with respect to the delivery ratio for all buffer size levels. Moreover, the figure shows that the advantage of the R-SOR protocol over the others is most pronounced at low and middle buffer size levels. This can be explained by the fact that the R-SOR protocol uses the buffer more efficiently than the other protocols. R-SOR involves all the social characteristics of the mobile users and adopts the ranking algorithm to select the next relay node. This leads to forwarding messages only to nodes that provide a higher likelihood of encountering the destination. Consequently, the R-SOR protocol is an effective forwarding mechanism, especially in resource-constrained opportunistic networks.
Figure 11 outlines the overhead ratio with varying buffer size for all protocols. Increasing the buffer size decreases the overhead ratio in the OMSN, as intermediate nodes are able to carry more messages for later delivery.
The R-SOR protocol shows a low overhead ratio in the OMSN. It outperforms the Epidemic, PRoPHET, and Bubble Rap protocols, especially for low buffer size levels. Moreover, the figure shows that the EpSoc and MSM protocols slightly outperform the R-SOR protocol because they apply strict social-based forwarding decision rules. Nonetheless, the delivery ratio for these protocols is lower than that of the proposed R-SOR protocol, as shown in Figure 10.
Figure 12 shows the performance of the routing protocols in terms of average latency with varying buffer size. It can be seen that increasing the buffer size results in an increase in the average latency. Consequently, the messages experience a larger end-to-end delay before reaching their destinations. For low buffer size levels, the R-SOR protocol has lower average latency than the PRoPHET, Bubble Rap, and MSM protocols, while the Epidemic and EpSoc protocols outperform all other protocols due to their flooding-based forwarding strategy. Exploiting users' social characteristics and ranking them based on the encounter rate in each time period enables R-SOR to select the forwarder node that has the higher likelihood of delivering the message with a lower end-to-end delay. When the buffer size is increased above 25MB, the R-SOR, PRoPHET, and Bubble Rap protocols achieve almost similar performance. They benefit from the ability to store more messages in the mobile nodes, and they outperform Epidemic, EpSoc, and MSM.
The average hop count evaluation results are shown in Figure 13. The figure reveals that the R-SOR protocol achieves a low hop count and outperforms the Epidemic and PRoPHET protocols, because only nodes that are similar to the destination with respect to its top-ranked social feature are selected for message forwarding. Hence, a small number of nodes cooperate in delivering the messages in the OMSN, and a low average hop count is attained.
The performance of the R-SOR protocol in terms of average hop count is close to that of the Bubble Rap and EpSoc protocols and is slightly higher than that of the MSM protocol. This is because the MSM protocol applies stricter rules in forwarding messages than the R-SOR protocol: it exploits multiple social metrics for relay selection, which reduces the number of intermediate nodes. However, as mentioned before and as shown in Figure 10, this negatively impacts the performance in terms of delivery ratio, where the R-SOR protocol outperforms the MSM protocol.
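To make the buffer size levels used in these experiments more concrete, the following Python sketch converts each buffer size into the number of 128KB messages a node can hold at once. The function name is hypothetical, and the conversion assumes 1MB = 1024KB; if the simulator interprets the units decimally, the counts differ slightly, so the factor is left as a parameter.

```python
MESSAGE_SIZE_KB = 128
BUFFER_LEVELS_MB = {
    "low": [1, 2, 5],
    "middle": [15, 25, 35],
    "high": [45, 55],
}


def buffer_capacity(buffer_mb: int, message_kb: int = MESSAGE_SIZE_KB, kb_per_mb: int = 1024) -> int:
    """Number of whole messages that fit in a buffer (binary units assumed by default)."""
    return (buffer_mb * kb_per_mb) // message_kb


# Capacity in messages for every buffer size, grouped by level.
capacities = {
    level: [buffer_capacity(mb) for mb in sizes]
    for level, sizes in BUFFER_LEVELS_MB.items()
}
```

Under this assumption, a 1MB buffer holds only 8 messages while a 55MB buffer holds 440, which gives a sense of how strongly the low buffer levels constrain protocols that replicate messages aggressively.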
V. CONCLUSION
In this paper, the R-SOR protocol is proposed, which aims to enhance the routing efficiency in OMSNs. The R-SOR protocol ranks the social properties of the mobile users in an OMSN during day time periods, and exploits the impact of social features in order to develop a more efficient forwarding scheme. The empirical results using real data show that R-SOR outperforms other benchmark protocols, namely flooding-based protocols such as EpSoc and Epidemic, prediction-based protocols such as PRoPHET, and social-based protocols such as MSM and Bubble Rap. Different evaluation metrics were used for performance evaluation and comparison, namely delivery ratio, routing overhead, average latency, and average hop count. Regarding future work, we plan to combine the ranking of multiple social metrics and to make use of the social characteristics of mobile users and Internet of Things (IoT) objects for an efficient data dissemination scheme in a social IoT environment.