A Novel Approach for Arabic Text Steganography Based on the “BloodGroup” Text Hiding Method

Steganography is the science of hiding messages (data) within unrelated data, possibly of a different form. Its purpose is covert communication, that is, hiding the existence of a message from an intermediary. Text steganography is the process of embedding a secret message (text) in another text (the cover text) so that the existence of the secret message cannot be detected by a third party. This paper presents a novel approach to text steganography, the Blood Group (BG) method, which is based on the behavior of blood groups. Experiments show that the proposed method achieves good results in capacity, hiding capacity, time complexity, robustness, visibility, and similarity, which indicates its superiority over several existing methods.
Keywords-Information Security; Steganography; Hiding Message; Text Steganography; Arabic Stego


I. INTRODUCTION
Exchanging hidden information is an important domain of information security and includes methods such as cryptography and steganography [1]. Steganography secures messages from outside interference and attacks by obscuring a file, message, image, or video within another, so that only the intended recipient is aware that the message exists. It covers one message with another so that nobody else has a notion of its presence, and the message can be perceived only by the predetermined addressee. Because the information is concealed in the cover media, no other person observes the presence of the secret information. Steganography has been implemented on different media such as text, video, images, and sound [2].
Fundamentally, steganographic techniques need to find insignificant bits of the medium or cover file, so that modifications to those bits do not damage the integrity of the cover medium. Undetectability is generally achieved by making the modifications to the cover file invisible. For text steganography, the fidelity of the stego text is generally used to evaluate the undetectability of the method. Fidelity describes how likely a person is to notice differences between the stego text and the cover text.
Nonetheless, the integrity of the cover text is not fully preserved, because some parts of the cover file must be modified to hide the secret message and obtain the stego file [3]. Text steganography is not commonly used, since text files contain only a small amount of redundant data.
The aim of this paper is to describe a new text steganography method and to propose a novel steganography technique for the Arabic language, called the "Blood Group" steganographic method.

II. PREVIOUS WORKS
The first proposal for Arabic text steganography was made in [4]. The scheme conceals binary values within Persian or Arabic script using a feature-coding strategy. It relies on the points (dots) that are inherent in Persian, Urdu, and Arabic letters: the location of the points on a letter conceals the data. The length of the hidden data (i.e., the Secret Object) is encoded in binary in the first several bits. The cover text (i.e., the Cover Object) is scanned, and whenever a pointed letter is detected, its point is shifted up slightly if the hidden bit is 1; otherwise, the location remains unchanged. The advantage of this method is that a large amount of information can be concealed, owing to the large number of pointed letters in both Persian and Arabic.
The authors of [5] proposed a steganography approach that conceals secret data within an Arabic cover text. The method employs Arabic diacritics, which mark vowel sounds and are found in many religious documents. Arabic has about eight of these symbols. The authors found that the "Fatha" symbol is used in Arabic text more than the other seven symbols. Thus, they used "Fatha" to represent 1 and the other symbols to represent 0. The benefit of this method is its large capacity, since each Arabic character can carry a diacritic. The downside is that concealing some diacritics can attract the reader's attention. In [6], the reverse "Fatha" was used to conceal data in the cover text instead of the regular "Fatha". It is less easily noticed by third parties, which is a benefit of this method. The downside is that a new font is required to render the reverse "Fatha", as it is not a standard diacritic.
In [7], the authors proposed a method based on "Kashida". Several algorithms were developed and implemented in a stego method named MSCUKAT (Maximizing Steganography Capacity Using "Kashida" in Arabic Text). The enhancements in this work are maximizing the capacity of the cover media to conceal more secret information, minimizing the growth in file size after the secret is hidden, and improving the security of the encoded cover media. The method was shown to be superior to similar previous methods. In [8], eight different techniques that use Arabic natural language to hide a secret message were considered. The authors of the present paper have also published a similar study in [9].
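The bit assignment described for [5] can be illustrated with a minimal Python sketch. It assumes that the eight diacritics are the Unicode combining marks U+064B to U+0652 and that every diacritic present in the stego text carries exactly one bit. Both points are assumptions made for illustration, and the function name is hypothetical rather than taken from [5].

```python
FATHA = "\u064E"
# Assumption: the eight diacritics are the combining marks U+064B..U+0652.
DIACRITICS = {chr(cp) for cp in range(0x064B, 0x0653)}

def diacritics_to_bits(text: str) -> str:
    """Read one bit per diacritic: Fatha -> 1, any other diacritic -> 0."""
    return "".join(
        "1" if ch == FATHA else "0"
        for ch in text
        if ch in DIACRITICS
    )

# Kaf+Fatha, Teh+Fatha, Beh+Damma: two Fathas followed by one other diacritic.
sample = "\u0643\u064E\u062A\u064E\u0628\u064F"
print(diacritics_to_bits(sample))  # prints 110
```

Letters without a diacritic contribute no bit in this sketch, which is why the capacity of such a scheme depends on how densely the cover text is vocalized.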

A. Architecture
The architecture of the system is the same as in [9]. It is organized in two parts: the sender side, which encrypts the secret message, compresses it, and then embeds it; and the receiver side, which decodes the stego message using the same encryption key.
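The sender and receiver sides can be sketched in Python as follows. This is only an outline under stated assumptions: the cipher is not named in this section, so a toy repeating-key XOR stands in for it, gzip is used because the results section mentions it, and all function names are hypothetical. The embedding step itself is left out; the sketch stops at the bit string that would be embedded.

```python
import gzip
from itertools import cycle

def xor_cipher(data: bytes, key: bytes) -> bytes:
    """Toy stand-in for the paper's unspecified cipher (NOT secure)."""
    if not key:
        raise ValueError("key must not be empty")
    return bytes(b ^ k for b, k in zip(data, cycle(key)))

def prepare_payload(secret: str, key: bytes) -> str:
    """Sender side: encrypt, compress, and return the bit string to embed."""
    encrypted = xor_cipher(secret.encode("utf-8"), key)
    compressed = gzip.compress(encrypted)
    return "".join(f"{byte:08b}" for byte in compressed)

def recover_secret(bits: str, key: bytes) -> str:
    """Receiver side: rebuild bytes, decompress, and decrypt with the same key."""
    data = bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))
    return xor_cipher(gzip.decompress(data), key).decode("utf-8")

bits = prepare_payload("secret message", b"shared-key")
print(recover_secret(bits, b"shared-key"))  # prints secret message
```

The receiver applies the inverse operations in reverse order (decompress, then decrypt), which is why both sides must share the same key.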

B. Stego Module
The hiding process is used by the sender to hide the secret message in the cover text. It involves selecting the input file, which represents the encrypted and compressed secret message, and a group of subprocesses, as represented in (1). The proposed method employs two stego options, inserting a "Kashida" and changing the Unicode code point of a letter, based on the behaviour of the blood group (ABO). For example, if the word is (در), where P (the previous letter) = د and C (the current letter) = ر, the result of applying the isolated type to the letter C is (0xFEAE).
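The Unicode-change option can be sketched in Python using the single example given in the text. The mapping below contains only that one pair (ر to 0xFEAE); the remaining entries of Table I are not reproduced here, and the dictionary and function names are hypothetical.

```python
# Only the pair from the example in the text; the rest of Table I is not reproduced.
ISOLATED_TYPE = {"\u0631": "\uFEAE"}  # ر -> 0xFEAE, as stated in the example

def apply_isolated_type(word: str, index: int) -> str:
    """Replace the letter at `index` with its mapped code point, if one is known."""
    ch = word[index]
    return word[:index] + ISOLATED_TYPE.get(ch, ch) + word[index + 1:]

stego = apply_isolated_type("\u062F\u0631", 1)  # word (در), C = ر at index 1
print([hex(ord(c)) for c in stego])  # prints ['0x62f', '0xfeae']
```

Letters without an entry in the mapping are returned unchanged, so the sketch never alters characters outside the table.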
Figure 1 shows the blood group behavior, and Table I gives the mapping from AB to AB` for isolated characters.
Fig. 1. Behavior of blood group stego method.
The stego algorithm of this method is described as follows:
Step 3: Check:
• If (previous letter is from A) and (Kashida code exists) and (current letter is from A), then return 1.
• If (previous letter is from B) and (Kashida code exists) and (current letter is from B), then return 1.
• If (previous letter is from A) and (changed Unicode code for the current letter exists) and (current letter is from AB), then return 1.
• If (previous letter is from B) and (changed Unicode code for the current letter exists) and (current letter is from AB), then return 1.
• If (previous letter is from AB) and (changed Unicode code for the current letter exists) and (current letter is from AB), then return 1.
• Otherwise, return 0.
Step 4: Gather every 8 bits into a byte, and then convert the bytes to a string.
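Steps 3 and 4 can be sketched in Python as follows. The letter sets for groups A, B, and AB are not listed in this section, so the sketch takes the group of each letter as a label; the flag names `has_kashida` and `has_changed_code`, the function names, and the UTF-8 decoding in Step 4 are assumptions made for illustration.

```python
from typing import Sequence

def extract_bit(prev_group: str, cur_group: str,
                has_kashida: bool, has_changed_code: bool) -> int:
    """Step 3: return 1 when one of the five listed conditions holds, else 0."""
    # Conditions 1-2: Kashida present and both letters in the same group (A or B).
    if has_kashida and prev_group == cur_group and cur_group in ("A", "B"):
        return 1
    # Conditions 3-5: changed code present, current letter in AB, previous in A, B, or AB.
    if has_changed_code and cur_group == "AB" and prev_group in ("A", "B", "AB"):
        return 1
    return 0

def bits_to_text(bits: Sequence[int], encoding: str = "utf-8") -> str:
    """Step 4: group bits into bytes (8 per byte) and decode them to a string."""
    usable = len(bits) - len(bits) % 8  # ignore a trailing partial byte
    data = bytes(
        int("".join(str(b) for b in bits[i:i + 8]), 2)
        for i in range(0, usable, 8)
    )
    return data.decode(encoding)

print(extract_bit("A", "A", True, False))    # prints 1
print(extract_bit("A", "B", True, False))    # prints 0
print(bits_to_text([0, 1, 0, 0, 1, 0, 0, 0,
                    0, 1, 1, 0, 1, 0, 0, 1]))  # prints Hi
```

In the full system the recovered bytes would still be compressed and encrypted, so decoding them directly as text applies only to this simplified illustration.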

IV. RESULTS
This section presents the experimental results used to evaluate the performance of the proposed system. The system was implemented in C#. The tests were run on a Dell workstation laptop with the following specifications: 1.8 GHz Core i3 CPU, 4 GB DDR3 RAM, Windows 8 64-bit, and Visual Studio 2013.

A. Capacity
First, we calculated the sizes of the secret messages before and after encryption, together with the compression ratio, to determine the gain from using compression to reduce the message size, which improves the hiding process by reducing the cover capacity needed. As shown in Table II, compression with gzip is very useful for large secret messages and is efficient in practice.
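The size comparison can be reproduced in outline with Python's standard `gzip` module. This is a sketch for illustration only: the function name and the sample messages are hypothetical, and the figures it produces are not the values of Table II.

```python
import gzip

def compression_report(message: bytes) -> dict:
    """Return sizes before and after gzip and the compressed/original ratio."""
    compressed = gzip.compress(message)
    return {
        "original_bytes": len(message),
        "compressed_bytes": len(compressed),
        "ratio": len(compressed) / len(message) if message else float("nan"),
    }

large = ("secret message " * 1000).encode("utf-8")
small = "hi".encode("utf-8")
print(compression_report(large))
print(compression_report(small))
```

A long, repetitive message shrinks considerably, while a very short one grows, because the gzip container adds a fixed header and trailer. This is consistent with the observation that compression pays off mainly for large secret messages.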

B. Hiding capacity
The hiding capacity (in bits/bytes) needed to hide two secret messages in a fixed cover was measured. The cover capacity needed by the Blood Group stego method is shown in Table III.

C. Time Complexity
To measure the actual time needed to embed with the Blood Group method, a timer function was used. Table IV shows the embedding time needed by this stego method for two secret messages with a fixed cover message.
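A timing measurement of this kind can be sketched in Python with `time.perf_counter`. The original system was written in C#, so this is only an analogous illustration; the helper name `time_call` is hypothetical, and any embedding function could be passed in place of the built-in used in the example.

```python
import time

def time_call(func, *args, repeats: int = 5):
    """Return (result, best elapsed seconds) over `repeats` runs of func(*args)."""
    best = float("inf")
    result = None
    for _ in range(repeats):
        start = time.perf_counter()
        result = func(*args)
        best = min(best, time.perf_counter() - start)
    return result, best

# Example with a built-in function standing in for an embedding routine.
result, seconds = time_call(sorted, list(range(100000, 0, -1)))
print(f"elapsed: {seconds:.6f} s")
```

Taking the best of several runs reduces the influence of background activity on the machine, which matters when the measured times are short.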

D. Robustness, Visibility, & Similarity
Robustness is the resistance of the steganography technique to modification or destruction of the secret message. Results are shown in Table V.

V. CONCLUSION
A hybrid method that combines cryptography and steganography is considered in this paper, and a novel algorithm called "BloodGroup" is proposed. The new scheme demonstrated rather good results in terms of capacity, speed, robustness, printing, copying and pasting, font changing, similarity, visibility, and security, but poor results under OCR, which should be attributed to the poor quality of the available Arabic OCR software. Further improvement is achieved when GZIP compression is also employed.

TABLE I. MAPPING FROM AB TO AB` FOR ISOLATED CHARACTERS
№ | Unicode | Character | Isolated Type | Medium Type
Combine the results and return.