Information fusion‐based method for distributed domain name system cache poisoning attack detection and identification
2015; Institution of Engineering and Technology; Volume: 10; Issue: 1; Language: English
DOI: 10.1049/iet-ifs.2014.0386
ISSN: 1751-8717
Authors: Hao Wu, Xianglei Dang, Lidong Wang, Longtao He
Topic(s): Internet Traffic Analysis and Secure E-voting
IET Information Security, Volume 10, Issue 1, pp. 37-44. Research Article, Free Access.
Hao Wu (corresponding author, wuhao@cert.org.cn), Xianglei Dang, Lidong Wang, Longtao He
National Computer Network Emergency Response Technical Team/Coordination Center of China (CNCERT/CC), Beijing, People's Republic of China
First published: 01 January 2016. https://doi.org/10.1049/iet-ifs.2014.0386. Citations: 3

Abstract

In this study, the authors consider the detection and identification of distributed domain name system (DNS) cache poisoning attacks. In the considered distributed attack, multiple cache servers are attacked simultaneously and the attack intensity at each cache server is slight. It is difficult to detect and identify the distributed attack with the existing detection methods based on local information, because the abnormal features at each cache server are indistinct under a distributed attack. To handle this problem, they propose information fusion-based detection and identification methods. They find that the entropies of the query Internet protocol (IP) addresses of all cache servers are approximately stationary and statistically independent in normal cases. When a distributed attack happens, they show that the correlation of the entropies among the cache servers can increase dramatically. On the basis of this feature, they use principal component analysis to design the detection and identification methods. Specifically, an attack is declared when the maximum eigenvalue of the normalised entropy matrix exceeds a threshold, and the attacked servers are identified by the main loading vector. Finally, they use a large-scale DNS in China and a simulation as two examples to show the effectiveness of their methods.

1 Introduction

The domain name system (DNS) plays an important role on the Internet, and many Internet services are closely related to the DNS.
The main function of the DNS is to translate easy-to-remember symbolic names into Internet protocol (IP) addresses. However, the DNS has been found to have many vulnerabilities, and increasing attention is being paid to the security and dependability of DNS services [1-5].

Cache poisoning is one of the most important and widely studied attacks on the DNS. A local DNS cache server stores the IP information of recently queried domain names, which reduces the number of queries to the root servers and improves the efficiency of the DNS. The DNS mainly uses the user datagram protocol with simple trust mechanisms, which gives attackers many opportunities. In [6], the Kaminsky cache poisoning attack was proposed, which has a higher success rate and may cause greater damage than traditional cache poisoning attacks. In response to the Kaminsky attack, major DNS resolvers were quickly patched. The patches include random source ports [7, 8], random selection of the name server IP addresses [7], randomising DNS queries by randomly 'case toggling' the domain names (0x20 encoding) [9] and adding a random prefix to the domain name [10]. The objective of these patches is to add random challenges to requests and validate them in responses, which efficiently reduces the success rate of the attack. However, there are ways to bypass the patches for DNS resolvers located behind network address translator (NAT) devices [11, 12]. Compared with defences that add random challenges, cryptographic defences, mainly the domain name system security extensions (DNSSEC), can fully prevent cache poisoning, even under a man-in-the-middle (MitM) attack. However, both end-points of the DNS transaction are required to adopt DNSSEC, and DNSSEC is more exposed to denial-of-service attacks than ordinary DNS.
Although there has been much discussion of the adoption and deployment of DNSSEC, for example [13, 14], it is still deployed only on a small scale and it is unclear when large-scale deployment can be achieved. The DNS therefore remains potentially insecure.

Different from the passive defence schemes [7, 9, 10], we consider the online active detection and identification of cache poisoning attacks. Our objective is to accurately detect the attack event and identify the attacked cache servers online by exploiting the information in DNS queries and replies. In [15], an effective entropy-based method was proposed for detecting the Kaminsky cache poisoning attack on a single cache server, based on the fact that the entropy of the query IP addresses decreases during the attack. When only one cache server is attacked, a large number of malicious query packets must be sent to guarantee a certain success rate, which makes the attack easy to detect because the entropy shows an evident downward trend.

In this paper, we focus on a distributed cache poisoning attack in which multiple cache servers are attacked simultaneously. In a distributed attack, only a small number of malicious query packets are sent to each cache server. From the attacker's perspective, the probability of a successful attack is proportional to the number of malicious DNS requests. When all malicious DNS requests are sent to a single cache server, they cause a large anomaly and the attack can be detected easily by the method in [15]. In contrast, in a distributed attack with a small number of packets per cache server, the anomaly at each cache server is slight and the attack is difficult to detect with existing methods such as that in [15].
At the same time, the overall success probability can still reach a certain level, as many cache servers are attacked and the total number of malicious DNS requests can still be large. In other words, the risk of being detected is spread over multiple cache servers. New methods should therefore be developed for the detection and identification of distributed cache poisoning attacks.

Different from the existing detection method based on the local information of one cache server [15], we present an information fusion-based method to detect and identify the distributed DNS cache poisoning attack. In a distributed attack, we observe that although the entropy of the query IP addresses of each cache server does not change significantly, the correlation of the entropies among the cache servers can increase markedly from the normal case to the attack case. On the basis of this feature, we use principal component analysis (PCA) to design the detection and identification methods. Specifically, an attack is declared when the maximum eigenvalue of the normalised entropy matrix exceeds a pre-set threshold, and a server is identified as attacked when the amplitude of its corresponding element in the main loading vector exceeds another threshold. We also present an approach for setting the thresholds by considering the false and missing alarm rates. Finally, we use a large-scale DNS in China and a simulation as two examples to show the effectiveness of our method.

This paper is organised as follows. In Section 2, the problem is stated rigorously. The detection and identification methods for the distributed cache poisoning attack are presented in Sections 3 and 4, respectively. In Section 5, an analysis of the performance of the presented method is given first, and then some design details and comparisons with other related methods are provided.
In Section 6, two examples are given to show the effectiveness of our methods.

2 Problem statements

In this section, we describe the considered problems rigorously. As discussed in [15], when only one DNS cache server of a domain name is attacked, the entropy of the IP addresses of the query packets decreases compared with the normal case, because the attacker must insert a large number of malicious query packets using one or a group of fixed IP addresses during this time period. The entropy of the query IP addresses in a time period, denoted by $x_1$, can be calculated as

$$x_1 = -\sum_{i=1}^{q} \frac{C_i}{\sum_{j=1}^{q} C_j} \log \frac{C_i}{\sum_{j=1}^{q} C_j}$$

where $C_i$ denotes the number of query packets from the $i$th IP address and $q$ is the number of distinct query IP addresses. Assume that $x_1$ is approximately stationary in normal cases, which is represented by

$$x_1(k) = \bar{x}_1 + w_1(k) \tag{1}$$

where $\bar{x}_1$ is a constant, $k$ denotes the time period and $w_1(k)$ denotes a zero-mean stationary random noise. If an attack is launched from time $k_0$ to $k_1$, then

$$x_1(k) = \bar{x}_1 + a_1(k) + w_1(k) \tag{2}$$

where $k_0 \le k \le k_1$ and $a_1(k) < 0$. The attack detection method in [15] is then intuitive and can be represented by the decision-making strategy

$$\begin{cases} x_1(k) - \bar{x}_1 < \delta_1: & \text{attack} \\ x_1(k) - \bar{x}_1 \ge \delta_1: & \text{no attack} \end{cases} \tag{3}$$

where $\delta_1 < 0$ is the threshold, which is mainly determined by a priori knowledge of $w_1(k)$.

From the attacker's perspective, the number of generated malicious query packets must be large enough to guarantee a certain success probability. However, such a high-intensity attack makes the amplitude of $a_1(k)$ rather large and therefore leads to a high detection probability. To examine how an attacker can avoid this easy detection, we introduce the following distributed cache poisoning attack. The cache servers of the DNS are commonly deployed in a distributed fashion. Let $S_1, S_2, \dots, S_N$ be the DNS cache servers, where $N$ is their number. Let $x_1, x_2, \dots, x_N$ be the entropies of the IP addresses of the query packets in a time period in $S_1, S_2, \dots, S_N$, respectively.
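The entropy of the query IP addresses defined above can be computed directly from the per-address packet counts. The following is a minimal sketch, assuming a base-2 logarithm (the text does not state the base) and using made-up addresses; `query_ip_entropy` is a hypothetical helper name, not part of the original method description.

```python
import math
from collections import Counter

def query_ip_entropy(query_ips):
    """Shannon entropy (in bits, an assumed unit) of the source IP addresses
    seen in one time period. query_ips has one entry per query packet."""
    counts = Counter(query_ips)          # C_i for each distinct address
    total = sum(counts.values())         # sum of all C_j
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

# Hypothetical traffic: four clients sending one query each ...
balanced = ["10.0.0.1", "10.0.0.2", "10.0.0.3", "10.0.0.4"]
# ... versus one address dominating the period, as in a single-server attack.
skewed = ["10.0.0.1"] * 7 + ["10.0.0.2"]

print(query_ip_entropy(balanced))   # 2.0
print(query_ip_entropy(skewed))
```

A period dominated by a few addresses yields a lower entropy than a period with evenly spread sources, which is the effect that the single-server method in [15] relies on.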
Then in normal cases, for $1 \le i \le N$,

$$x_i(k) = \bar{x}_i + w_i(k) \tag{4}$$

In the considered distributed attack, multiple cache servers are attacked simultaneously. Let $S_{m_1}, S_{m_2}, \dots, S_{m_n}$ be attacked, where $1 \le m_1 < m_2 < \dots < m_n \le N$, and let $M = [m_1, \dots, m_n]$. Then the entropies can be represented by

$$x_i(k) = \begin{cases} \bar{x}_i + a_i(k) + w_i(k), & i \in M \\ \bar{x}_i + w_i(k), & i \notin M \end{cases} \tag{5}$$

where $-\omega_i < a_i(k) < 0$ and $\omega_i > 0$, $i \in M$. A simplified representation of (5) is

$$x(k) = \bar{x} + a(k) + w(k) \tag{6}$$

where $x(k) = [x_1(k), \dots, x_N(k)]^T$, $\bar{x} = [\bar{x}_1, \dots, \bar{x}_N]^T$, $w(k) = [w_1(k), \dots, w_N(k)]^T$ and $a(k) \in \mathbb{R}^N$; here the $m_i$th element of $a(k)$ is equal to $a_{m_i}(k)$ and the other elements are zero. The constraints on $a_i(k)$ mean that the attack on each cache server is slight, which makes it difficult to detect the attack with the detection method (3). In particular, when the false alarm rate is higher than the correct detection rate, that is

$$P\big(x_i(k) - \bar{x}_i < \delta_i \mid a_i(k) = 0\big) > P\big(x_i(k) - \bar{x}_i < \delta_i \mid a_i(k) \ne 0\big) \tag{7}$$

where $P(\cdot \mid a_i(k) \ne 0)$ represents the probability of the event given $a_i(k) \ne 0$, we regard the detection method (3) as invalid for the attack.

The problems we consider in this paper are summarised as follows:

Problem 1: Since the detection method for a single-server attack has low detection accuracy or is no longer valid, how can an effective method be designed to accurately detect the distributed cache poisoning attack (6)?

Problem 2: How can the set $M$ of attacked cache servers be identified from $[1, 2, \dots, N]$?

Remark 1. To the best of our knowledge, all existing results on DNS poisoning attack detection mainly consider the local attack in which a single cache server is attacked, which is different from the problem considered in this paper.

3 Attack detection

In this section, we give the detection method and show how to set the detection threshold.

3.1 Detection strategy

By observing (5) or (6), we find that although the attack on one cache server may be so slight that it is difficult to detect, multiple simultaneous attacks cause a relatively obvious change in the correlation of the entropies of the query IP addresses among the cache servers.
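A short worked calculation may help to see why a slight but simultaneous shift shows up in the eigenvalues. It is a sketch under assumptions that the text does not state explicitly: the noises $w_i(k)$ are mutually independent with a common variance $\sigma^2$, the attack vector $a$ is constant over the observation window, and each entropy is normalised with its attack-free mean and standard deviation, that is $z(k) = (x(k) - \bar{x})/\sigma$.

```latex
\begin{align*}
z(k) &= \frac{a + w(k)}{\sigma}, \\
\mathbb{E}\left[ z(k)\, z(k)^{T} \right] &= I + \frac{a\, a^{T}}{\sigma^{2}}, \\
\left( I + \frac{a\, a^{T}}{\sigma^{2}} \right) \frac{a}{\lVert a \rVert}
  &= \left( 1 + \frac{\lVert a \rVert^{2}}{\sigma^{2}} \right) \frac{a}{\lVert a \rVert}, \\
\lambda_{\max} &= 1 + \frac{\lVert a \rVert^{2}}{\sigma^{2}},
  \qquad \lambda_{2} = \dots = \lambda_{N} = 1 .
\end{align*}
```

Without an attack ($a = 0$) all eigenvalues equal 1 in expectation. With $n$ servers shifted by the same amount $a_0$, the largest eigenvalue becomes $1 + n a_0^2 / \sigma^2$, so it grows with the number of attacked servers even when each individual shift is small, and the associated eigenvector is proportional to $a$. These are expected values; with a finite window the sample eigenvalues fluctuate around them.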
In this paper, we exploit the correlation of the IP entropies to detect and identify the attack. PCA is a common tool for the correlation analysis of multiple variables. For multiple time series $x_i = [x_i(1), x_i(2), \dots, x_i(k)]$, $i = 1, 2, \dots, N$, let $X$ be the data matrix after normalisation. The matrix $X$ can be decomposed into a score matrix $T$ and a loading matrix $P$ by the singular value decomposition (8), where $\lambda_i$, $i = 1, 2, \dots, N$, are the eigenvalues of the covariance matrix of $X$ or, equivalently, the squares of the singular values of $X$. The correlation of $x_1, x_2, \dots, x_N$ can be determined from the matrix $\Lambda$. If the maximum of $\lambda_i$, $i = 1, 2, \dots, N$, is much larger than the other eigenvalues, then $x_1, x_2, \dots, x_N$ are strongly correlated; if all $\lambda_i$ are similar, then $x_1, x_2, \dots, x_N$ are weakly correlated.

We now use PCA to design the detection method. Given a horizon length $l > N$, at time $k$ we first construct the normalised data matrix $X$ by (9). Then we adopt the following detection strategy

$$\begin{cases} \lambda_{\max} \ge \delta: & \text{attack} \\ \lambda_{\max} < \delta: & \text{no attack} \end{cases} \tag{10}$$

where $\lambda_{\max} = \max\{\lambda_i(X),\ i = 1, 2, \dots, N\}$ and the detection threshold $\delta$ is a positive constant.

3.2 Threshold setting

We now consider how to set the threshold $\delta$. Two factors are important. The first is the false alarm rate, which is the probability that an attack is reported when no attack is actually happening. The second is the missing alarm rate, which is the probability that no attack is reported when an attack is actually happening. We present the design of the threshold $\delta$ by considering the false alarm rate, the missing alarm rate, and then both of them.

When no attack happens, the normalised data matrix is given by (11). Then $\lambda_{\max}$ is a positive random variable, which is determined by the random noises $w_i(k - l + j)$, $1 \le i \le N$, $1 \le j \le l$.
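The detection statistic can be sketched in a few lines of NumPy. This is an illustrative sketch rather than the authors' implementation: it assumes that each entropy series is normalised with an attack-free mean and standard deviation (`baseline_mean` and `baseline_std` are hypothetical names), that the covariance matrix is taken as $X^T X / l$, and that an alarm is raised when $\lambda_{\max} \ge \delta$. The data and the shift of one standard deviation are made up for illustration.

```python
import numpy as np

def max_eigenvalue(window, baseline_mean, baseline_std):
    """Largest eigenvalue of the covariance of a normalised entropy window.

    window: array of shape (l, N); each row holds the N server entropies of
    one time period. baseline_mean and baseline_std are the attack-free mean
    and standard deviation per server (assumed known from historical data).
    """
    X = (np.asarray(window, dtype=float) - baseline_mean) / baseline_std
    cov = X.T @ X / X.shape[0]                  # N x N symmetric matrix
    return float(np.linalg.eigvalsh(cov)[-1])   # eigvalsh sorts ascending

def attack_detected(window, baseline_mean, baseline_std, delta):
    """Decision rule: report an attack when lambda_max reaches delta."""
    return max_eigenvalue(window, baseline_mean, baseline_std) >= delta

# Illustrative data: 6 servers, horizon l = 40, entropy ~ N(10, 0.5^2).
rng = np.random.default_rng(0)
mean, std = np.full(6, 10.0), np.full(6, 0.5)
normal = rng.normal(10.0, 0.5, size=(40, 6))
attacked = normal.copy()
attacked[:, 1:5] -= 0.5          # hypothetical shift on servers S2..S5

print("lambda_max, no attack:", max_eigenvalue(normal, mean, std))
print("lambda_max, attack   :", max_eigenvalue(attacked, mean, std))
```

In the noise-free limit the statistic reduces to the squared norm of the normalised shift, which is a convenient sanity check for an implementation.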
Given the threshold $\delta$, the false alarm rate $R_f(\delta)$ is

$$R_f(\delta) = P\big(\lambda_{\max} \ge \delta \mid a(k) = 0\big) \tag{12}$$

We can therefore choose a threshold $\delta$ such that the false alarm rate falls within an acceptable range. Assume that the false alarm rate is required to be no larger than $\varepsilon_f$. Then the corresponding threshold $\delta$ is given by (13). The analytical calculation of (13) is impractical because of the complex relationship between $\lambda_{\max}$ and the multiple random variables $w_i(k - l + j)$. Here we adopt a Monte Carlo numerical algorithm to solve (13) approximately. First, we sample the matrix $X$ according to (11), obtaining $Z$ samples in total. Then, for each sample, we calculate its maximum eigenvalue $\lambda_{\max}$. This yields $Z$ samples of $\lambda_{\max}$, from which (13) can be solved by (14).

Now we discuss how to select the threshold $\delta$ by considering the missing alarm rate. The missing alarm rate is more difficult to handle than the false alarm rate, because it depends on the attack. For convenience, we assume that the attacks on all attacked cache servers are the same, that is

$$a_i(k) = a, \quad i \in M \tag{15}$$

where $a$ is a negative constant. Given the threshold $\delta$, the missing alarm rate $R_m(\delta, a)$ is

$$R_m(\delta, a) = P\big(\lambda_{\max} < \delta \mid a_i(k) = a,\ i \in M\big) \tag{16}$$

We further define the maximum tolerable attack $a_0$: if $a_0 < a < 0$, the attack is tolerable and does not cause serious damage; if $a \le a_0$, the attack is intolerable and must be detected. In the following, we focus on the missing alarm rate $R_m(\delta)$ under the maximum tolerable attack $a_0$, that is

$$R_m(\delta) = R_m(\delta, a_0) \tag{17}$$

Assume that the missing alarm rate is required to be no larger than $\varepsilon_m$. Then the corresponding threshold $\delta$ is given by (18). The solution of (18) can also be approximated by the Monte Carlo algorithm, as expressed by (19).

We now consider the false alarm rate and the missing alarm rate simultaneously in the threshold design. We first use an example to show how the threshold $\delta$ affects the false alarm rate and the missing alarm rate.
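Before turning to that example, the Monte Carlo procedure for choosing $\delta$ from a false alarm bound could be sketched as follows. The sketch makes several assumptions that are not taken from the text: the normalised noise is standard normal, the covariance is $X^T X / l$, an alarm is raised when $\lambda_{\max} \ge \delta$, and the threshold is read off as an empirical quantile of the sampled $\lambda_{\max}$ values. All function names are hypothetical, and the parameters (three servers, horizon 6, a normalised shift of -1 on two servers) are chosen only for illustration.

```python
import numpy as np

def sample_lambda_max(n_servers, horizon, n_samples, rng, attack=None):
    """Draw n_samples values of lambda_max for unit-variance normalised noise.

    attack: optional length-N vector of normalised shifts (0 for servers that
    are not attacked); None means the attack-free case.
    """
    shift = np.zeros(n_servers) if attack is None else np.asarray(attack, float)
    lams = np.empty(n_samples)
    for z in range(n_samples):
        X = rng.standard_normal((horizon, n_servers)) + shift
        lams[z] = np.linalg.eigvalsh(X.T @ X / horizon)[-1]
    return lams

def threshold_for_false_alarm(lams_normal, eps_f):
    """Threshold whose empirical false alarm rate is at most eps_f."""
    ordered = np.sort(lams_normal)
    idx = int(np.ceil((1.0 - eps_f) * len(ordered))) - 1
    idx = min(max(idx, 0), len(ordered) - 1)
    # Step to the next float so this sample itself stays below the threshold
    # under the ">= delta" alarm rule.
    return float(np.nextafter(ordered[idx], np.inf))

def alarm_rates(lams_normal, lams_attack, delta):
    """Empirical (false alarm rate, missing alarm rate) for a given delta."""
    false_alarm = float(np.mean(lams_normal >= delta))
    missing_alarm = float(np.mean(lams_attack < delta))
    return false_alarm, missing_alarm

rng = np.random.default_rng(0)
normal = sample_lambda_max(3, 6, 2000, rng)
attack = sample_lambda_max(3, 6, 2000, rng, attack=[-1.0, -1.0, 0.0])
delta = threshold_for_false_alarm(normal, eps_f=0.05)
print("delta:", delta)
print("(false alarm, missing alarm):", alarm_rates(normal, attack, delta))
```

Raising `delta` can only lower the empirical false alarm rate and raise the empirical missing alarm rate, which is the trade-off that any threshold choice has to balance.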
Let $N = 3$, $l = 6$ and let $w_i(k - l + j)$, $1 \le i \le N$, $1 \le j \le l$, follow the standard normal distribution. Let the first and second cache servers be attacked with $a_0 = -1$. Then, by Monte Carlo sampling and wavelet-based denoising, the probability density functions of $\lambda_{\max}$ without and with attacks are obtained as shown in Fig. 1. Fig. 1 shows that a large $\delta$ gives a small false alarm rate but a large missing alarm rate, whereas a small $\delta$ gives a large false alarm rate but a small missing alarm rate. The selection of $\delta$ is therefore a trade-off between the false and missing alarm rates.

Fig. 1 False and missing alarm rates with threshold $\delta$

4 Attack identification

In this section, we address the problem of identifying the attacked cache servers $[m_1, m_2, \dots, m_n]$. As discussed in Section 2, it is difficult to identify the attacked cache servers locally, so we again exploit the integrated information of multiple cache servers to perform the identification.

4.1 Identification strategy

When no attack happens, the differences between the eigenvalues of the data matrix $X$ are slight, and the scores of all loading vectors in $P$ from the PCA (8) are similar. When an attack of the form (15) happens and a relatively large eigenvalue is detected, the main principal component vector in $P$, denoted by $P_{\max}$, is close to the attack vector $a(k)$. In other words, an element of the main principal component vector with a large amplitude indicates that the corresponding cache server is attacked with high probability. We therefore adopt the following identification strategy for $i = 1, 2, \dots, N$:

$$\begin{cases} |P_{\max}(i)| \ge \rho: & S_i \text{ is attacked} \\ |P_{\max}(i)| < \rho: & S_i \text{ is not attacked} \end{cases} \tag{20}$$

where $\rho$ is the threshold.

4.2 Threshold setting

Now we consider how to determine $\rho$. As for detection, we consider the false and missing alarm rates. We first consider $\rho$ based on the false alarm rate. When no attack happens, $P_{\max}$ is a random vector related to the noise matrix $W$.
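The identification rule of Section 4.1 can be sketched as follows. This is an illustration under assumptions: the main loading vector is taken as the unit-norm eigenvector of $X^T X / l$ for the largest eigenvalue, a server is flagged when $|P_{\max}(i)| \ge \rho$, and the value $\rho = 0.25$ is hypothetical (the text does not report the threshold used in Example 1). The vector `table2` holds the values reported in Table 2.

```python
import numpy as np

def main_loading_vector(X):
    """Unit-norm eigenvector of X^T X / l belonging to the largest eigenvalue.

    X: normalised data matrix of shape (l, N). The sign of an eigenvector is
    arbitrary, so only the amplitudes of its elements are meaningful here.
    """
    X = np.asarray(X, dtype=float)
    _, vecs = np.linalg.eigh(X.T @ X / X.shape[0])   # eigenvalues ascending
    return vecs[:, -1]

def attacked_servers(p_max, rho):
    """1-based indices i of the servers S_i with |P_max(i)| >= rho."""
    return [i + 1 for i, p in enumerate(p_max) if abs(p) >= rho]

# Main loading vector reported in Table 2; rho = 0.25 is a hypothetical value.
table2 = [0.0912, 0.6621, 0.4488, 0.2994, 0.5119, 0.0158]
print(attacked_servers(table2, rho=0.25))   # [2, 3, 4, 5]
```

With these values the rule flags $S_2$ to $S_5$. A noticeably larger threshold would miss $S_4$, whose element (0.2994) has the smallest amplitude among the attacked servers.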
The false alarm rate for each cache server $S_i$ is the probability that $S_i$ is judged to be attacked when there is actually no attack on $S_i$, which can be represented by (21). The false alarm rates are obviously the same for all $i = 1, 2, \dots, N$. Given a tolerable false alarm rate, the threshold $\rho$ can then be determined by (22). Equation (22) can also be solved by a Monte Carlo algorithm similar to the one for (14), and we omit the details here.

We next consider $\rho$ based on the missing alarm rate. The missing alarm rate for each cache server $S_i$ is the probability that an attack is not detected when it actually happens to $S_i$. The missing alarm rates are likewise the same for all $i = 1, 2, \dots, N$. Under the maximum tolerable attack $a_0$, the missing alarm rate is given by (23). Assume that the missing alarm rate is required to be no larger than a given bound. Then the corresponding threshold $\rho$ is given by (24), which can be solved by a Monte Carlo algorithm similar to (19). The discussion of $\rho$ considering both the false and missing alarm rates is almost the same as that for detection, and we also omit it here.

5 Further discussion

In this section, we give a performance analysis of our method. After that, we provide a detailed design of the distributed Kaminsky DNS poisoning attack detection and identification methods presented above and compare them with other related methods.

5.1 Detection and identification performance analyses

Without loss of generality, we assume that the attack intensity $a(k)$ is time invariant, that is, $a(k) = a$ and $a_i(k - l + j) = a_i$ for $i \in M$ and $j = 1, 2, \dots, l$. Denote the covariance of the noise $w_i(k)$ by $W_i$, $i = 1, 2, \dots, N$, and assume that $W_1 = W_2 = \dots = W_N = W$. Then we have the following conclusion.

Theorem 1. The following results hold: 1) When $a_i = 0$ for $i \in M$, there is . 2) When $a_i \ne 0$ for $i \in M$, there are and .
Proof. It can be easily seen that 1) When $a_i = 0$ for $i \in M$, based on the law of large numbers, there is where $I$ is an identity matrix with appropriate dimensions. Therefore the conclusion holds. 2) When $a_i \ne 0$ for $i \in M$, there is , where matrix $\Omega$ satisfies Then it can be easily derived that and Therefore the results of 1) and 2) hold true and the proof ends. □

From Theorem 1, we can see that $\lambda_{\max}(a_i \ne 0) > \lambda_{\max}(a_i = 0)$ must hold. That is, when data collected over a long enough time are used, the attack can always be detected and identified by our method, even when only one cache server is attacked.

5.2 Detailed design

As the detection in our method is time-driven, let $t_k$, $k = 1, 2, \dots$, denote the times at which the detection method is executed.

First, a simple local data processing unit is added to each DNS cache server. Each data processing unit copies all query packets of the local cache server from time $t_k$ to $t_{k+1}$, and the IP addresses of all query packets are extracted and stored. At the same time, a table $H$ is maintained, which contains all distinct IP addresses of the query packets and their counts from time $t_k$ to $t_{k+1}$; see Table 1 for an example. On the basis of table $H$, the entropy of the local query IP addresses of each cache server from time $t_k$ to $t_{k+1}$ is calculated. After that, all data processing units simultaneously send their current local entropies to a central processing unit at time $t_{k+1}$.

Table 1. Distinct IP addresses of the query packets and their counts from time $t_k$ to $t_{k+1}$
Query IP address    Number
111.21.1.121        5
111.44.2.123        3
111.80.3.105        2
111.10.4.101        …

Table 2. Main loading vector $P_{\max}$ at time 100
        S1      S2      S3      S4      S5      S6
Pmax    0.0912  0.6621  0.4488  0.2994  0.5119  0.0158

Second, the central processing unit uses the collected data to execute the detection method of Section 3 at time $t_{k+1}$. If $\lambda_{\max} < \delta$, then nothing needs to be done; otherwise, an alarm is generated to indicate that an attack is happening.
Then the central processing unit executes the identification method of Section 4 to determine which cache servers are under attack. If $|P_{\max}(i)| < \rho$, then the $i$th cache server is not under attack; otherwise, the $i$th cache server is considered to be under attack.

Finally, if an attack is detected, the central processing unit sends alarms to all cache servers under attack. Each of these cache servers then performs the following two tasks: (i) it inspects the query IP addresses with the largest counts from time $t_k$ to $t_{k+1}$ and verifies whether their responses are NXDOMAIN; if so, the clients at these IP addresses are regarded as attackers and their future requests are refused; and (ii) it inspects all existing records in the cache server that were stored as a result of the attackers' requests, in order to find successful poisoning attacks.

5.3 Comparisons with other methods

In this paper, we fuse the information of multiple DNS cache servers to detect the anomaly caused by the poisoning attack. This can be seen as a collaborative approach, as multiple cache servers are involved in the detection in a cooperative way. In the following, we compare it with other collaborative approaches to DNS poisoning attacks, for example [16]. In [16], the DNS-relevant data are captured by sensors at the network edge and then collected by a central processing unit. An anomaly is detected if the number of non-existent domain responses goes beyond the normal range. Compared with this kind of collaborative approach, we consider a distributed DNS poisoning attack with multiple cache servers attacked, and we exploit the correlation of the DNS data from different cache servers to detect the attack. Our detection method has the following differences or advantages:

(1) By using PCA, our detection method can effectively reduce the influence of data noise on the detection performance.
Therefore the distributed attack can be detected more accurately and quickly than with the method in [16], where the data noise is not processed and the false or missing alarm rate could be large. In addition, Theorem 1 shows that with our method the attack can always be detected when data collected over a long enough time are used.

(2) On the basis of the information fusion strategy, the cache servers under attack can also be identified by our identification method, whereas this problem was not considered in [16] and cannot be directly and easily derived from the results of [16].

(3) As each cache server sends only one numerical value to the central processing unit in each time slot, the communication traffic of our method is much smaller than that in [16], where all DNS data need to be transferred.

The detection strategy in [17] matches the inbound DNS responses against the outbound DNS requests. A poisoning attack is detected when a DNS response does not match any of the outbound DNS requests. After that, three DNS requests are sent by the resolver and further matching on the validation fields and the response order is performed, which removes subsequent invalid DNS responses to the attacked DNS request and so reduces the probability of successful attacks. Compared with the detection method in [17], our method has the following differences:

(1) By matching, the method in [17] can effectively detect unsuccessful attacks. However, when the DNS responses match the corresponding DNS requests, that is, when the attack succeeds, the method in [17] is ineffective, even though the probability of a successful attack is very low. In contrast, our method can detect the attack regardless of whether it succeeds, as we use the information of the query IP addresses and there is an anomaly for both successful and unsuccessful attacks.
(2) As fused information is exploited in our method, we can find more information about the attack. For example, we can find that the attacks on multiple cache servers are launched by the same attacker, or we can identify the area that the attackers are most likely to intrude.

(3) In [17], every response, including regular ones, must undergo a matching process, which delays domain name resolution. In contrast, detection in our method has no influence on regular domain name resolution, as detection and resolution run in parallel and independently.

6 Illustration

In this section, we give two examples to illustrate the effectiveness of the proposed detection and identification methods, with a comparison against the existing detection method for a single-server attack.

6.1 Example 1

We take a very large-scale domain name in China as an example and consider six of its cache servers. Fig. 2 shows the entropies of the query IP addresses of the cache servers $S_2$, $S_3$, $S_4$ and $S_5$ for 100 min in normal cases; each curve contains 20 points and the time period of each point is 5 min. It can be seen that the entropies are almost stationary.

Fig. 2 Entropies of the query IP addresses in the cache servers $S_2$ to $S_5$ in normal cases: (a) $S_2$, (b) $S_3$, (c) $S_4$, (d) $S_5$

For another 100 min, we insert malicious queries for $S_2$, $S_3$, $S_4$ and $S_5$ from the 60th minute, which gives $a_1(k) = 0$, $a_2(k) = a_3(k) = a_4(k) = a_5(k) = -0.3$ and $a_6(k) = 0$ for $60 \le k \le 100$. That is, $S_2$, $S_3$, $S_4$ and $S_5$ are attacked from time 60 to 100. The entropies under attack are shown in Fig. 3. From Fig. 3, we can see that it is difficult to detect the attack based on the local information in $S_2$, $S_4$ and $S_5$, as the downward trends of their entropies are indistinct. Detection and identification based on the local information of one cache server would therefore give large false and missing alarm rates.

Fig. 3 Entropies of the query IP addresses in the cache servers $S_2$ to $S_5$ when $S_2$ to $S_5$ are attacked from time 60 to 100: (a) $S_2$, (b) $S_3$, (c) $S_4$, (d) $S_5$

The detection results of our method are presented in Fig. 4. From time 60 to 100, there is a dramatic increase in the maximum eigenvalue, which indicates a large increase in the correlation of the entropies among the cache servers. In addition, the maximum eigenvalue at time 100 exceeds the threshold $\delta = 1.6$ by about $(2.6 - 1.6)/1.6 \times 100\% = 62.5\%$, so the attack is detected with a high confidence level. The main loading vector is given in Table 2. The second, third, fourth and fifth elements have large amplitudes, which indicates that $S_2$, $S_3$, $S_4$ and $S_5$ are attacked. In summary, the attack event is correctly detected by our detection method and the attacked cache servers are accurately identified by our identification method.

Fig. 4 Detection results of the attack occurring from time 60 to 100

6.2 Example 2

To verify the presented method for a larger number of cache servers than in Example 1, we now consider a simulation example. Assume that there are 50 cache servers $S_1$ to $S_{50}$ for a domain name and that the 30 cache servers $S_1$ to $S_{30}$ are attacked from time 100 to 200. In normal cases, the entropy of the query IP addresses of each cache server follows a Gaussian distribution with mean 10 and variance 0.25. The attacks on $S_1$ to $S_{30}$ give $a_i(k) = -0.5$ for $i = 1, 2, \dots, 30$ and $100 < k \le 200$.
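A simulation of this kind can be set up in a few lines. The following sketch uses the parameters stated above (50 servers, mean 10, variance 0.25, a shift of -0.5 on the first 30 servers for $100 < k \le 200$) together with a horizon of 60 periods. It assumes, beyond what the text specifies, that the window is normalised with the known attack-free mean and standard deviation and that the covariance is $X^T X / l$. The random seed is arbitrary, so the printed values will not reproduce the paper's figures.

```python
import numpy as np

N_SERVERS, N_ATTACKED = 50, 30
HORIZON, T_END, T_ATTACK = 60, 200, 100
MEAN, STD, SHIFT = 10.0, 0.5, -0.5      # variance 0.25 -> std 0.5

def lambda_max(X):
    """Largest eigenvalue of X^T X / l for a normalised window X."""
    return float(np.linalg.eigvalsh(X.T @ X / X.shape[0])[-1])

def simulate(rng):
    """Entropy matrix of shape (T_END, N_SERVERS); row t-1 is time t."""
    x = rng.normal(MEAN, STD, size=(T_END, N_SERVERS))
    x[T_ATTACK:, :N_ATTACKED] += SHIFT   # times 101..200, servers S1..S30
    return x

def lambda_series(x):
    """lambda_max of the sliding window ending at each time k >= HORIZON."""
    series = {}
    for k in range(HORIZON, T_END + 1):
        window = (x[k - HORIZON:k] - MEAN) / STD   # times k-l+1 .. k
        series[k] = lambda_max(window)
    return series

rng = np.random.default_rng(1)
series = lambda_series(simulate(rng))
for k in (100, 104, 120, 160, 200):
    print(k, round(series[k], 2))
```

Under these assumptions, $\lambda_{\max}$ keeps growing while the window fills with attacked samples and levels off once the whole window lies inside the attack interval, which with a horizon of 60 happens at time 160.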
We choose $S_1$ and $S_2$ as two examples and show their entropies in Fig. 5. It can be seen that the abnormal changes caused by the attack are very slight, and it is very difficult to detect the attack based on the local information of a single cache server. This conclusion also holds for the other 28 attacked cache servers $S_3$ to $S_{30}$.

Fig. 5 Entropies of the query IP addresses in the cache servers $S_1$ and $S_2$ with attack time from 100 to 200: (a) $S_1$, (b) $S_2$

For our method, we select the horizon length $l = 60$; the detection results are shown in Fig. 6. It can be observed that the attack is detected quickly and accurately (the detection time is 104 in Fig. 6). The advantage of our method over detection based on local information is evident. In addition, the confidence level of the detection results becomes higher and higher from time 104 to 160, as the difference between $\lambda_{\max}$ and the threshold keeps growing. From time 161 to 200, $\lambda_{\max}$ tends to be stationary, because all elements of the attacked servers in the normalised data matrix $X$ are then affected by the attack.

Fig. 6 Detection results of the attack occurring from time 101 to 200

In Fig. 7, the main loading vectors at times 104, 120 and 160 are presented. We set the identification threshold to 0.6. In Fig. 7 a, there are six cache servers with missing alarms and five cache servers with false alarms. In Fig. 7 b, there are two cache servers with false alarms. In Fig. 7 c, all attacked cache servers are correctly identified. In summary, from time 104 to 160, the attacked cache servers are identified more and more accurately, because the correlations become larger as time goes by.

Fig. 7 Identification results of the attack at times 104, 120 and 160: (a) time 104, (b) time 120, (c) time 160

7 Conclusion

In this paper, we present an information fusion-based method for distributed DNS cache poisoning attack detection and identification. We use PCA to design the detection and identification methods. Specifically, an attack is declared when the maximum eigenvalue of the normalised entropy matrix exceeds a threshold, and the attacked servers are identified by the main loading vector. We also present an approach for setting the thresholds by considering the false and missing alarm rates. Finally, two examples are used to verify the effectiveness of our method.

8 Acknowledgment

This work was supported by the Key Technologies R&D Program of China under grant 2012BAH46B02.

9 References

[1] Fetzer, C., Pfeifer, G., Jim, T.: 'Enhancing DNS security using the SSL trust infrastructure'. The Tenth Int. Workshop on Object-Oriented Real-Time Dependable Systems, 2005, pp. 21–27
[2] Afonso, J., Veiga, P.: 'Enhancing DNS security using dynamic firewalling with network agents'. Federated Conf. on Computer Science and Information Systems, 2011, pp. 777–782
[3] Chandramouli, R., Rose, S.: 'Open issues in secure DNS deployment', IEEE Secur. Priv., 2009, 7, (5), pp. 29–35 (https://doi.org/10.1109/MSP.2009.129)
[4] Ji, H.: 'Research on design and security strategy of DNS', Appl. Mech. Mater., 2013, 378, pp. 510–513 (https://doi.org/10.4028/www.scientific.net/AMM.378.510)
[5] Yuan, L.: 'A proxy view of quality of domain name service, poisoning attacks and survival strategies', ACM Trans. Internet Technol., 2011, 12, (3), pp. 1–26 (https://doi.org/10.1145/2461321.2461324)
[6] Kaminsky, D.: 'It's the end of the cache as we know it'. Black Hat Conf., 2008
[7] Hubert, A., van Mook, R.: 'Measures for making DNS more resilient against forged answers', RFC 5452, 2009
[8] CERT: 'Multiple DNS implementations vulnerable to cache poisoning'. Technical Report Vulnerability Note, 800113, 2008
[9] Dagon, D.: 'Increased DNS forgery resistance through 0x20 bit encoding: security via leet queries'. ACM Conf. on Computer and Communications Security, 2008, pp. 211–222
[10] Perdisci, R., Antonakakis, M., Day, K., Luo, X., Lee, W.: 'WSEC DNS: protecting recursive DNS resolvers from poisoning attacks'. DSN, 2009, pp. 3–12
[11] Herzberg, A., Shulman, H.: 'Security of patched DNS, lecture notes in computer science' (Springer, 2012), pp. 271–284
[12] Herzberg, A., Shulman, H.: 'Fragmentation considered poisonous'. IEEE Conf. on Communications and Network Security, 2013
[13] Herzberg, A., Shulman, H.: 'Towards adoption of DNSSEC: availability and security challenges'. IEEE Conf. on Communications and Network Security, 2013
[14] Bau, J., Mitchell, J.C.: 'A security evaluation of DNSSEC with NSEC3'. Network and Distributed Systems Security Symp., 2010
[15] Musashi, Y.: 'Detection of Kaminsky DNS cache poisoning attack'. The Fourth Int. Conf. on Intelligent Networks and Intelligent Systems (ICINIS), 2011
[16] Zdrnja, B., Brownlee, N., Wessels, D.: 'Passive monitoring of DNS anomalies'. The Fourth Int. Conf., DIMVA 2007, Lucerne, Switzerland, 2007
[17] Herzberg, A., Shulman, H.: 'Unilateral antidotes to DNS poisoning'. The Seventh Int. ICST Conf., SecureComm, London, 2011
Referência(s)