
A discussion on covert DNS traffic

DNS is everywhere, but is it always being used for legitimate purposes? Did you know you can run a VPN over DNS for free WiFi?

Mathias Hedberg

Jun 21, 2020 — 22 min read


Introduction

Since its inception in the late 1980s, the Domain Name System (DNS) has become an essential part of computer networking. Its operation is critical for most of the applications and protocols we interact with daily, and ensuring its proper operation is a must-win battle for any network administrator. This seemingly simple protocol has some unintended uses which are often overlooked by network administrators. A DNS threat report from 2017 revealed that 25% of organizations in the US experienced some form of data exfiltration via DNS, and that 25% of these had either customer data or intellectual property stolen. With so many systems heavily entangled with and reliant on DNS, adding restrictions to increase security comes with the added risk of applications and services not running as they should. This results in a dilemma for network administrators, who must balance security and functionality. Disabling DNS, even in locked-down networks, is often infeasible.

There are many examples of internet-limited networks that require functional DNS. Take the example of a WiFi hotspot with a captive portal login. If TLS is to be used in the user authentication process, then DNS must be present, as HTTPS relies on it for authenticating the login page. Corporate networks may have sections running mission-critical services that are protected by heavy IP filtering in firewalls. However, many of these services may require local host name resolution or name-based load balancing, requiring a local DNS resolver to be present.

Allowing unrestricted DNS access can have unintended consequences. These range from botnets using DNS to organize and orchestrate attacks to more advanced threat actors using it as a side channel for malware command and control communication, potentially circumventing IDS/IPS mechanisms which are focused on monitoring other protocols on the network. There are other issues such as DNS leakage; however, the scope of this post is limited to covert/side-channel communication over DNS.

The aim of this document is to help network administrators get a better understanding of DNS, and the unexpected consequences of unvetted and unchecked access to the protocol. Exposing how many of these malicious systems function may help network administrators apply this knowledge to protect their own networks. Although there are no one-size-fits-all solutions, having a thorough understanding of the issue may help with planning of network infrastructure, and help reduce incident response times.

Background

We will now take a deep dive into some background information relevant to get a better understanding of the topic at hand. Readers are expected to have a basic understanding of computer networking and some knowledge of DNS.

Command and Control

Malware, botnets, and other forms of malicious software often require some form of external communication for coordination and data exfiltration. The remote command and control channel (C&C) is a defining characteristic of botnets. A significant amount of research has been done on detecting this traffic to aid in detecting malware infections. These advancements in detection mechanisms have caused botnet owners to take further steps to ensure the resilience of their botnets, especially with regard to their C&C communication. C&C is not limited to botnet use. Other, more targeted malware may seek to extract sensitive information from networks, or require some human intervention to achieve further lateral movement within the network. In this case, strict firewall rules may be a challenge, meaning alternate forms of communication must be established to ensure continued operation. The C&C channel can then be used for data exfiltration, or to download additional malware to run on the infected network.

A popular model used to describe cyber-attacks is the cyber killchain (Yadav and Rao 2015), which was originally developed by Lockheed Martin. Although it is a descriptive model, it has proven to be a useful tool for describing the complex series of events taken by attackers (Case 2016) in large cyber incidents, including the command and control stage.

Most malware uses some form of C&C as a part of remotely executed cyber-attacks. This takes place after the malware has been installed on an infected host. Command and control traffic plays a pivotal role in advanced malware, so it is in an attacker’s best interest to mask this traffic to remain hidden. It acts as a method for receiving new instructions, and potentially also as a channel for data extraction.

The structure of C&C communication can usually be described as centralized, decentralized or social network based. DNS-based C&C can take either a centralized or a decentralized form, depending on how it is implemented and used. DNS as a protocol is decentralized; however, that categorization becomes less meaningful once all queries end up at a single point. For example, by setting the TTL of a DNS resource record to 0, the benefits of caching cannot be harnessed, meaning that queries to that domain become reliant on the availability of the name server hosting the zone. The resulting structure more or less resembles a centralized one.

Having established C&C means that an attacker can start examining the network from the inside. This could be anything from port scanning the internal network to extracting credentials for use in a different attack vector. Not having a communication channel to the malware means operating blindly, which is not ideal. In some cases, however, such as with the Stuxnet worm, C&C is physically not possible and the software must operate autonomously (Matrosov et al. 2010).

With both the risk and consequences of malware infection being relatively high, IT departments and security researchers are always on the lookout for indicators of compromise (IOCs). C&C traffic is usually a good indicator of some form of infection, which is why many organizations deploy some form of Intrusion Detection System (IDS) such as Suricata or Snort in their networks. These systems, paired with advanced sensor rules, have made it easier to detect malicious traffic from all stages of the cyber killchain, including C&C. This has caused malware developers to look for more covert options for their C&C communication. Creative alternatives for communication have been observed, such as receiving instructions from code hidden in Twitter memes. This trend towards using covert communication channels makes it difficult for IT administrators and security researchers to distinguish between malicious and legitimate network usage (Sheridan and Keane 2015).

In many cases, network access is severely limited, so the only option for C&C communication is to piggyback on other legitimate protocols. DNS is a core internet function available in most networks, making it a prime target to leverage for C&C.

Known attacks using DNS for C&C

As mentioned in the introduction, DNS has played a part in many cyber attacks. We will now take a deep dive into some of the larger well known attacks which have been reliant on DNS for some form of C&C traffic.

Credit card details are something many people are reluctant to give away when shopping online, often opting for only a few trusted sites such as PayPal or Amazon. Some people choose to avoid online shopping altogether out of fear that their payment details will be stolen. This, however, does not mean they are exempt from cyber attacks.

Between April and September of 2014, the multi-billion dollar US retailer Home Depot was the target of an extensive and targeted attack by criminals aiming to steal credit card information from its point-of-sale (PoS) systems (Rascagneres 2016). The cyber criminals managed to walk away with 56 million debit and credit card numbers and 53 million email addresses belonging to customers of the company.

The attackers made use of a third-party vendor’s credentials to gain initial access to the network. Using this access, they managed to elevate their privileges in the network and maneuver towards the PoS systems (Roman 2020) where they then infected around 7500 self-checkout systems.

The surprising thing about this attack was that the malware stole credit card details by extracting them from the memory of the PoS systems. This sets it apart from typical credit card leaks, where databases of users and their payment details are dumped from websites using some form of web hacking. The malware, given the name FrameworkPOS (Nadler, Aminov, and Shabtai 2019), operated autonomously on the PoS systems and used a custom-made DNS-based channel to extract the credit card details over the course of the six-month infection. Home Depot estimated that this attack would cost it around USD 62 million, with customers being offered complimentary credit monitoring at the company’s expense.

Moving over to some more well-known actors, we have APT34, also known as OilRig, which is suspected to be an Iranian group targeting mostly Middle Eastern victims. This group has been documented to make extensive use of DNS tunneling for command and control traffic (Falcone 2019b). OilRig has been observed using DNS tunneling for command and control since as early as May 2016, using the protocol in many of its known tools. In 2019, tools allegedly owned by OilRig were leaked, giving substantial insight into their function and the group’s modus operandi (Lee 2019).

These tools have been used for various operations, such as attacks on Saudi Arabian financial and technological institutions. The attacks usually result in backdoors which are then used to gain sensitive information, using both HTTP and DNS for C&C.

A different APT known for using DNS for C&C is DarkHydrus, which has been observed using DNS C&C in its targeted attacks on Middle Eastern government entities as late as 2019 (Falcone 2019a). This group makes use of well-crafted spear-phishing emails to convince the target to download and open malware-ridden Excel spreadsheets, which then rely on Excel macros to achieve infection.

The repeated use of DNS for C&C by large APTs shows that the technology is still a valid option for C&C traffic.

The inception of DNS

DNS is a product of the rapid expansion of the number of hosts on the internet. Before its inception, all host name to address mappings were administered and maintained using a single HOSTS.TXT file, which was distributed via FTP to all hosts from what was known as the Network Information Center. This file saw explosive growth as the number of hosts on the internet grew. In addition, local networks were appearing which wanted some form of local host to address resolution. Applications were getting more advanced, and it was clear to many that a new system had to be implemented.

The result of this was a range of ideas and proposals, with the design of a distributed database for name resources described in (Mockapetris 1983) planting the seed for the further conceptual design (Mockapetris 1987a) and basic implementation (Mockapetris 1987b) of DNS. These memos laid down the foundations for what we know as DNS today. The system has undergone some evolutionary changes to better support current use cases, such as the NOTIFY feature for zone transfers (Vixie 1996) and CNAME flattening (Prince 2018). Further advancements are constantly rolling out and seeing widespread adoption, such as the infamous DNS Queries over HTTPS (DoH).

DNS conceptual design

To better understand how DNS can be utilized as a side channel for communication, it is important to familiarize oneself with DNS from an implementation standpoint.

DNS is a protocol used to map domain names to IP addresses. The protocol primarily uses port 53/UDP for communication; however, 53/TCP may be used in some cases, such as when handling DNS responses larger than what a single UDP datagram can carry (Dickinson et al. 2016). An extension called Extension Mechanisms for DNS (EDNS) allows UDP payloads greater than 512 bytes as long as both parties support the extension.

DNS has three main components in its conceptual design: resource records (RRs), name servers and resolvers. RRs are, simply put, the database entries present on a name server. A name server responds to queries from a resolver with the RRs that match the query parameters. Resolvers are the middle ground between software and the domain name system, resolving addresses on behalf of the requesting software. Machines on a network do not all need to run a full resolver, but can rather make use of stub resolvers (also known as forwarders), which forward the queries to a recursive resolver. ISPs usually host recursive resolvers for their customers, pushing the server addresses to clients via DHCP. Other popular public resolvers exist, such as 8.8.8.8 (hosted by Google) and 1.1.1.1 (hosted by Cloudflare). Running a public recursive DNS resolver is a risky operation which could lead to the server being used by adversaries to perform DNS amplification attacks. There is still a surprising number of public recursive resolvers, many of which have been the target of DNS poisoning and amplification attacks (Kührer et al. 2015).

When a resolver is tasked with performing a query for a specific domain name, it starts with the root servers, making its way down the tree until it reaches the authoritative name server for the requested domain. It is essentially up to the owner of the domain to decide which servers to use as the authoritative name servers for that specific domain name. This could be a hosting provider such as Amazon Route 53 or Cloudflare; however, it is also possible to host these services manually on a home laptop or server using DNS server software such as BIND. To avoid having the authoritative name servers queried every time a client requests to resolve an address, results may be cached based on a time to live (TTL) value, which is defined in the RRs at the discretion of the domain owner. Setting the TTL to 0 means that the results should not be cached and any new requests for the same resource record should result in the name server being queried again.

DNS Query protocol

DNS queries are in essence a request to a name server for a specific resource record that matches both a specific domain and record type. This could be, for example, a TXT record for www.yourdomain.com, or an NS record for n1.sub.domain.com. In addition, a query must also specify the record class; however, this is not relevant to the topic.

When querying for an RR, the response data field (RDATA) will vary in format based on the query type. Queries for RRs of type ’A’ are answered with an IP address in the RDATA field, while other types such as ’TXT’ are answered with character strings. With the response also comes the time to live (TTL) value mentioned earlier, which defines how long (in seconds) the RR can remain in cache for use in future queries. The DNS lookup utility dig(1) is a great tool for interrogating DNS servers using any chosen query type and can aid in learning how DNS functions.

An alternate use for DNS

Based on what we have learned in the previous section, it may be apparent that DNS can be utilized as an alternative communication channel. For example, a query could be made for the A RR of what-time-is-it.yourdomain.com. The authoritative name server for yourdomain.com may then respond with a CNAME record containing the current time followed by the domain, such as the-time-is-1455.yourdomain.com. Anyone who controls a domain name and a server on a publicly accessible IP can set up such a channel by assigning any internet-connected server the role of authoritative name server for either the domain itself or a subdomain of it, as this is just another RR of type NS.

DNS tunneling

DNS tunneling is a non-standard method for exchanging data using the DNS protocol. DNS tunneling uses valid DNS queries to exchange other types of information to and from an authoritative name server.

Tunneling data over DNS has many uses. Usually it serves as a method to circumvent firewall rules, or as a way to hide traffic from monitoring systems. This makes it an ideal choice for tasks such as C&C communication, where malware must operate in a hidden fashion behind firewalled networks.

With regard to circumventing firewall rules, captive portals for WiFi hotspot login are a prime example (Yassine et al. 2018). DNS is often functional in this scenario so that the login page can be resolved and the user can log in. The unintended consequence is that some networks don’t filter DNS requests, allowing any name to be resolved. This means that communication with external name servers can take place, opening the door to IP over DNS systems. There even exist commercial VPN over DNS solutions for use in this exact scenario. Unfiltered DNS is the case for many popular hotspot solutions, such as the modern Unifi captive portal system. This can easily be tested by querying the local DNS server for a domain on the internet while not authenticated for internet use.

In some cases, DNS tunneling can be used to get free internet access and avoid paying mobile network fees when roaming (Do et al. 2017). This is due to DNS not triggering the billing functionality in most mobile networks. The research also mentions how some dishonest mobile operators equip their customers’ devices with software to avoid the charging function when roaming abroad.

DNS tunneling is not only a method to circumvent restrictive firewalls; it may also be used by malicious software for exfiltrating sensitive information. Many organizations have little to no monitoring of DNS, and rather focus on web or email traffic as many attacks take place there (Farnham and Atlasis 2013). Surprisingly for many, the threats and risks associated with DNS tunneling are not something new. The topic of DNS tunneling was observed and discussed by researchers as early as 1998 (Farnham and Atlasis 2013).

Implementation

DNS tunnels have three main elements: server software running on an authoritative name server, client software on the device where the tunnel terminates, and the domain name on which the queries piggyback. Critical for all implementations of such a tunnel is the need to encode data in a format which is compatible with the DNS protocol.

When tunneling over DNS there are two main data channels: the upstream data, which is transported via DNS queries, and the downstream data, which appears in the form of query responses. Upstream data to be tunneled via a DNS query can only be encapsulated in the question section of a DNS message. The question section has three components, the QNAME, QTYPE and QCLASS, as illustrated in figure 1. The type and class are only 16 bits each, while the QNAME is significantly larger. The QNAME is the domain name, so the upstream data must be encoded in a format that is compatible with the requirements of this field. Responses can be sent via response data types such as TXT, which are less restrictive and can contain more data.

Figure 1: DNS question section format (“INACON Protocol Help Protocol Help - DNS Question Section Format” 2010)
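To make the layout of the question section more concrete, the following is a minimal Python sketch that serializes a question into its wire format: the QNAME as length-prefixed labels terminated by a zero octet, followed by the 16-bit QTYPE and QCLASS. The function names `encode_qname` and `build_question` are made up for this illustration and are not taken from any tunneling tool.

```python
import struct

QTYPE_TXT = 16   # TXT record type (RFC 1035)
QCLASS_IN = 1    # Internet class (RFC 1035)

def encode_qname(name: str) -> bytes:
    """Encode a domain name as length-prefixed labels ending in a zero octet."""
    out = bytearray()
    for label in name.rstrip(".").split("."):
        raw = label.encode("ascii")
        if not 1 <= len(raw) <= 63:
            raise ValueError(f"label length out of range: {label!r}")
        out.append(len(raw))   # one length octet per label
        out += raw
    out.append(0)              # the root label terminates the name
    if len(out) > 255:
        raise ValueError("encoded name exceeds 255 octets")
    return bytes(out)

def build_question(name: str, qtype: int = QTYPE_TXT, qclass: int = QCLASS_IN) -> bytes:
    """Return the question section: QNAME followed by 16-bit QTYPE and QCLASS."""
    return encode_qname(name) + struct.pack("!HH", qtype, qclass)

question = build_question("www.example.com")
```

Only the QNAME portion has room for arbitrary data, which is why tunneling tools concentrate their upstream payload there.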

Whether or not the limitations set forth by the specification are followed is in reality at the discretion of the software developers. Size limits for DNS are described in RFC 1035 (Mockapetris 1987b). Per this memo, labels (such as ’www’ in ’www.example.com’) must be 63 octets or less, names (the full domain name such as ’www.example.com’) must be 255 octets or less, and the UDP messages themselves must be 512 octets or less. DNS is case insensitive and no significance is attached to the case; however, the case should (per the RFC) be preserved wherever possible. Domain labels also have character limitations: they must only contain alphanumeric characters and hyphens (a-z, 0-9 and ’-’). This is regulated even further so that labels must start with a letter and end with a letter or digit. Hyphens may only be used as interior characters.
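The limits above can be expressed as a short check. The sketch below is a hypothetical helper (`is_valid_name` is not part of any library) that applies the 63-octet label limit, the 255-octet name limit counted in wire format, and the letter/digit/hyphen rule as described in this section. Real resolvers are frequently more lenient than this.

```python
import re

# Letters, digits and hyphens only; must start with a letter and
# end with a letter or digit, as described in the text above.
LABEL_RE = re.compile(r"^[A-Za-z]([A-Za-z0-9-]*[A-Za-z0-9])?$")

def is_valid_name(name: str) -> bool:
    """Check a domain name against the size and character limits above."""
    labels = name.rstrip(".").split(".")
    # Wire length: one length octet per label plus the terminating zero octet.
    wire_length = sum(len(label) + 1 for label in labels) + 1
    if wire_length > 255:
        return False
    return all(len(label) <= 63 and LABEL_RE.match(label) is not None
               for label in labels)
```

A name carrying a Base64 payload containing ’=’ or ’+’ fails this check, which is the motivation for the alternative encodings discussed next.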

As mentioned earlier, the upstream data must be encoded in the QNAME field, which must follow these requirements. Simple text may be trivial to encode, but other data, such as binary files, must first be encoded as text. Base64 can be challenging to use, as it is case sensitive and makes use of additional symbols outside of those allowed in a domain label. Custom encoding mechanisms must be used to compensate for these requirements. However, it should be noted that some DNS servers have been observed to function with characters outside the RFC-specified range (yarrick 2020).

The example below shows a simple dialogue being communicated over DNS. The message "how are you doing today?" is encoded using base32 and sent to the local resolver as a TXT query. Note that the message length results in no padding, which is important as ’=’ is not a supported character.

$ echo "how are you doing today?" | base32
NBXXOIDBOJSSA6LPOUQGI33JNZTSA5DPMRQXSPYK
$ dig txt NBXXOIDBOJSSA6LPOUQGI33JNZTSA5DPMRQXSPYK.example.com

The request is handled by the authoritative name server where the message is decoded. The response is then encoded and placed in the RDATA field of the reply.

;; ANSWER SECTION:
NBXXOIDB...PMRQXSPYK.example.com. 0 IN TXT "JEQGC3JAMZUW4ZIK"

$ echo "JEQGC3JAMZUW4ZIK" | base32 -d
I am fine
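The same exchange can be sketched in Python for messages of arbitrary length. This is an illustrative sketch rather than how any particular tool does it: the helpers `encode_to_qname` and `decode_from_qname` are hypothetical, padding is stripped and restored instead of relying on a convenient message length, and the encoded data is split into labels of at most 63 characters. The 255-octet limit on the full name is not enforced here.

```python
import base64

MAX_LABEL = 63  # RFC 1035 label limit in octets

def encode_to_qname(data: bytes, domain: str) -> str:
    """Base32-encode data, drop '=' padding, and split it into labels."""
    encoded = base64.b32encode(data).decode("ascii").rstrip("=")
    labels = [encoded[i:i + MAX_LABEL] for i in range(0, len(encoded), MAX_LABEL)]
    return ".".join(labels + [domain])

def decode_from_qname(qname: str, domain: str) -> bytes:
    """Reverse encode_to_qname: strip the domain, rejoin labels, restore padding."""
    payload = qname[: -len(domain) - 1].replace(".", "").upper()
    payload += "=" * (-len(payload) % 8)   # base32 input must be a multiple of 8
    return base64.b32decode(payload)

message = b"how are you doing today?\n"  # echo appends the trailing newline
qname = encode_to_qname(message, "example.com")
```

Because base32 only uses one letter case, the decoder can normalize to upper case and still recover the data if a resolver along the path changes the case of the query.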

Impact of DNS over TLS/HTTPS

DNS over TLS (DoT) addresses some of the security issues present in the original DNS specification (Hu et al. 2016). It leverages TLS over port 853 to create a secure, confidential connection to a DNS server with DoT support. The benefit of this system is that it encrypts, and its biggest con is also that it encrypts. This can create some headaches for network administrators who want to inspect DNS traffic to detect rogue tunneling, and may lead to the protocol being blocked on some locked-down corporate networks. DNS over HTTPS (DoH) presents an even bigger problem as it uses HTTPS on port 443/tcp as its carrier, which is more challenging to block without blocking all HTTPS traffic (unlike DoT, where 853/tcp can be blocked). Queries are essentially standard HTTPS requests, making them almost impossible to differentiate from other traffic. An example of such a query is seen below:

curl 'https://dns.google.com/resolve?name=example.com&type=TXT'

Google, for example, hosts DoH on its servers, meaning that blocking access to Google’s DoH would in reality also mean blocking a range of other Google services. Some network administrators may try to solve this issue by blocking the IP addresses of the DoH servers based on their name resolution; however, this is not adequate due to the issue of domain fronting. Many of Google’s servers, even "blogger.com", will respond to DNS queries if domain fronting to dns.google.com is performed (Ziaja 2019). An example of such a (functional) query is shown below.

curl -H 'Host: dns.google.com' 'https://blogger.com/resolve?name=example.com&type=TXT'

Firefox has already started its rollout of DNS over HTTPS by default for US users. This may become a challenge for some network administrators who wish to have more fine-grained control over the DNS queries being performed on the network.

Software implementations

We will now look at some of the more popular software implementations which could be used for some form of covert command and control traffic.

In general, one can categorize side-channel communication over DNS into two categories, one being performance focused DNS tunnels, and the other being low throughput, security focused.

The approach chosen by a threat actor is often scenario specific. In some cases a noisy, high-performance solution is just fine. This could be the case for low-risk botnets where a lost node is manageable as there are thousands of others. Alternatively, the target may be in an unmonitored network with strong firewall rules, where DNS may be the only way for inter-network communication. No response from network administrators is expected as there is no monitoring in place, hence a fast tunnel can be used.

Moving over to malware tailor made for specific targets, a solution with a smaller footprint may be chosen. The threat actor may be targeting some corporate network where extensive network monitoring is in place. Detection could mean that the foothold is lost, and an investigation may be started, closing any potentially open vectors which were used for initial infection. Treading lightly with a slower DNS C&C system may be the optimal solution in such a scenario.

The software analyzed in detail here is taken from a list of recognized DNS toolkits compiled by Unit 42, the cyber security research group of Palo Alto Networks.

Iodine

Iodine is an open source set of software that allows easy and high performance tunneling of IPv4 data through DNS.

Iodine works by having both a server and client program to enable the tunneling. The server runs on a host which is both exposed to the internet on 53/udp, and assigned as the authoritative server for a specific domain name. This allows the endpoints to communicate using normal IP based protocols, such as UDP, TCP, ICMP etc. The client software runs on most platforms, and is needed to encode the data properly. It creates a tun device (virtual layer 3 interface) so that routing can be set up as desired. Iodine uses passwords for authentication, but not encryption. The client authenticates itself to the Iodine server, which in turn whitelists the IP address for future tunneling. Mobile clients with changing addresses must re-authenticate for each new IP address.

As there is no encryption or signing of traffic, the channel is vulnerable to trivial man-in-the-middle attacks. It is recommended to use some form of double tunneling (SSH SOCKS proxy or VPN) to achieve adequate traffic security.

Upstream data is compressed with gzip and encoded using Base32/64/64u/128. The encoding used is based on the character-set supported by the DNS resolvers/subs. Some DNS servers support more characters than what is allowed by the DNS specification, resulting in Base64/128 being used whenever possible. Base64 is used if mixed case is preserved and ’+’ is allowed. Base64u if ’_’ is supported instead of ’+’. Base128 is used if high-byte-value characters can be used.

Iodine makes use of the DNS extension EDNS0 to allow using DNS packets longer than 512 bytes. Furthermore, it makes use of the NULL type (type value 10) and PRIVATE type (type value 65280-65534). The NULL type per RFC 1035 (which describes the initial DNS implementation) is an experimental feature allowing any response as long as it is 65535 octets or less. Creating a signature for such requests is trivial, as using NULL queries is quite uncommon. DNS queries using private type values are also easy to create signatures for, as these are even more uncommon. Iodine also functions with other types such as TXT, SRV, MX, CNAME and A. It falls back to these if it detects limited query support, or if they are specified manually at runtime.
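As a rough illustration of how simple such a signature can be, the sketch below flags query types that rarely appear in ordinary traffic. The function `is_uncommon_qtype` is a hypothetical helper, the type values are the ones given above, and a production rule would live in an IDS such as Snort or Suricata rather than in Python.

```python
QTYPE_NULL = 10                      # NULL record type (RFC 1035)
PRIVATE_USE = range(65280, 65535)    # private-use types 65280-65534 inclusive

def is_uncommon_qtype(qtype: int) -> bool:
    """Flag query types that are rarely seen in legitimate traffic."""
    return qtype == QTYPE_NULL or qtype in PRIVATE_USE
```

A check like this says nothing about tunnels that fall back to TXT, CNAME or A queries, so it can only ever be one signal among several.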

Iodine focuses on performance rather than stealth, which becomes more apparent when analyzing the types of DNS queries it makes. Performance testing (Merlo et al. 2011) reveals stable, linear performance that holds up across different test scenarios. Its developers report speeds of up to 2.5 Mbit/s down and 677 kbit/s up on local setups.

There are currently publicly available Snort rules for detecting Iodine traffic (Sheridan and Keane 2015). Iodine traffic is also quite easy to detect via statistical analysis, as the queried domain names have high entropy compared to normal internet domains (Born and Gustafson 2010). Testing has also shown that Iodine causes increased DNS traffic even when the tunnel is idle (Sheridan and Keane 2015), mainly due to polling as a result of the connectionless nature of DNS.
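The entropy argument can be illustrated with a few lines of Python. This is a minimal sketch of character-level Shannon entropy applied to the first label of a name; it is not the method used in the cited research, and any alerting threshold would have to be tuned against real traffic.

```python
import math
from collections import Counter

def shannon_entropy(text: str) -> float:
    """Shannon entropy of a string in bits per character."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

def first_label_entropy(qname: str) -> float:
    """Entropy of the leftmost label, where tunnels usually place their payload."""
    return shannon_entropy(qname.split(".")[0].lower())

tunnel = first_label_entropy("NBXXOIDBOJSSA6LPOUQGI33JNZTSA5DPMRQXSPYK.example.com")
normal = first_label_entropy("www.example.com")
```

Here `tunnel` comes out higher than `normal`, since the encoded label spreads its characters far more evenly than a short, human-chosen host name.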

TUNS

TUNS is an IP over DNS tunnel which only uses CNAME records, making it slower than other systems, with the benefit of being harder to detect. It also lowers its tunnel interface MTU so that IP packets are not split into multiple queries, but rather made to fit in a single DNS query.
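To see why the MTU has to shrink, it helps to estimate how much raw data fits in one query name. The sketch below is a back-of-the-envelope calculation under stated assumptions: it uses the 255-octet name limit and 63-octet label limit from RFC 1035, and assumes a base32-style encoding with 5 bits per character. The encoding TUNS actually uses is not described here, and `max_payload_bytes` is a hypothetical helper.

```python
def max_payload_bytes(domain: str, bits_per_char: int = 5) -> int:
    """Upper bound on raw bytes that fit in a single QNAME under `domain`."""
    # Wire cost of the fixed suffix: one length octet per label plus the root octet.
    suffix_cost = sum(len(label) + 1 for label in domain.rstrip(".").split(".")) + 1
    budget = 255 - suffix_cost                # octets left for data labels
    full_labels, rest = divmod(budget, 64)    # a full label costs 63 + 1 octets
    chars = full_labels * 63 + max(0, rest - 1)
    return chars * bits_per_char // 8

print(max_payload_bytes("example.com"))  # 148 with 5 bits per character
```

With roughly 150 bytes available per query under these assumptions, a standard 1500-byte IP packet would have to be spread over many queries, which is consistent with the choice to lower the tunnel MTU instead.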

Although TUNS does not reach the same levels of throughput as other software such as Iodine, it has an efficient polling system that reduces perceived latency (Aiello, Merlo, and Papaleo 2013). Latency is a challenge for all DNS tunneling systems, as the authoritative name server is expected to process and reply to DNS queries instantly. The TUNS server handles this by giving an empty reply, which tells the client to wait and come back later.

Feederbot

Feederbot (Dietrich et al. 2011) was a botnet discovered by security researchers in 2011. It uses DNS as its backbone for C&C traffic. The researchers were able to reverse engineer the malware to understand its C&C traffic protocol. Its use in any real attacks is currently unknown.

Feederbot uses valid DNS syntax, making use of TXT records for downstream communication. Compared to other solutions such as Iodine, the Feederbot malware implements encryption for its communication and is able to derive keys using domain name queries. Figure 2 is what the researchers show as an example query for such key derivation.

Figure 2: Feederbot domain name structure (Dietrich et al. 2011)

Feederbot does not make use of the system’s pre-configured resolver addresses, but instead uses its own DNS resolvers. The reason for this is unknown, though there may be some benefits to such an approach.

The researchers tried to resolve the domain used by Feederbot using a different resolver, which resulted in an NXDOMAIN response. This could potentially have triggered some form of alert system for the malware developers. We do not currently know if a functional name server was defined for the domain, or even if the domain was registered. If it was, the name server may have seen that the query came from outside its whitelisted address range and responded with NXDOMAIN while also alerting its developers. Passive DNS may have given a better indicator. Feederbot making use of its own resolver could also mean leaving fewer traces of its actions in logs (and potentially also circumventing any passive DNS systems).

Feederbot is optimized for large-scale botnets. It receives instructions from a set of predefined C&C servers which redirect the malware to a set of other C&C servers which the malware should then use for further communication. The researchers only discovered two such servers; however, this could be the aftermath of a bigger operation.

Figure 3: Feederbot message chunks (Dietrich et al. 2011)

Feederbot response RDATA is both RC4 encrypted and Base64 encoded, as illustrated in figure 3. It even uses CRC32 checksums (unencrypted), which are useful for detecting packet errors: because the encryption scheme is a stream cipher, proper decryption of each packet relies on the packets before it having been received intact.
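The combination of a stream cipher, a cleartext checksum and Base64 can be sketched as follows. This is an illustration of the general technique only: the chunk layout (a 4-byte CRC32 of the ciphertext in front of the ciphertext) and the helper names `pack_chunk` and `unpack_chunk` are assumptions made for this example and do not reproduce Feederbot’s actual wire format. Each chunk is also keyed independently here for simplicity, whereas a real stream would carry cipher state across chunks.

```python
import base64
import struct
import zlib

def rc4(key: bytes, data: bytes) -> bytes:
    """Plain RC4; the same call both encrypts and decrypts."""
    state = list(range(256))
    j = 0
    for i in range(256):  # key scheduling
        j = (j + state[i] + key[i % len(key)]) % 256
        state[i], state[j] = state[j], state[i]
    i = j = 0
    out = bytearray()
    for byte in data:  # keystream generation and XOR
        i = (i + 1) % 256
        j = (j + state[i]) % 256
        state[i], state[j] = state[j], state[i]
        out.append(byte ^ state[(state[i] + state[j]) % 256])
    return bytes(out)

def pack_chunk(key: bytes, plaintext: bytes) -> str:
    """Assumed layout: cleartext CRC32 + RC4 ciphertext, Base64 encoded."""
    ciphertext = rc4(key, plaintext)
    header = struct.pack("!I", zlib.crc32(ciphertext))
    return base64.b64encode(header + ciphertext).decode("ascii")

def unpack_chunk(key: bytes, chunk: str) -> bytes:
    """Verify the checksum before decrypting; raise ValueError on corruption."""
    raw = base64.b64decode(chunk)
    checksum = struct.unpack("!I", raw[:4])[0]
    ciphertext = raw[4:]
    if zlib.crc32(ciphertext) != checksum:
        raise ValueError("checksum mismatch, chunk corrupted")
    return rc4(key, ciphertext)
```

For a defender, the practical takeaway is that such RDATA looks like ordinary Base64 text in a TXT record, so content inspection alone is unlikely to reveal what is being carried.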

Protecting against DNS misuse

Covert communication channels such as DNS tunneling pose significant risk to businesses and networks that store and process sensitive data. Detecting and stopping such traffic in a timely manner may be the difference between a successful and an unsuccessful cyber attack.

In some cases, DNS with external lookups is not needed. The solution could be as simple as only allowing DNS queries which go through a dedicated local server, where only whitelisted domains are resolved (this is supported by popular DNS servers such as dnsmasq and Windows Server). This could be the local domain and any other domains required for proper network operation. Operating with a whitelist is, however, not always a viable option, especially with users who require internet access.
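The matching logic behind such a whitelist is straightforward. The sketch below is a hypothetical illustration of suffix matching, not the configuration syntax of dnsmasq or Windows Server; the domains in `ALLOWED_SUFFIXES` are placeholders.

```python
ALLOWED_SUFFIXES = ("corp.example", "updates.example.com")  # placeholder domains

def is_allowed(qname: str, allowed=ALLOWED_SUFFIXES) -> bool:
    """Return True if qname equals, or is a subdomain of, an allowed domain."""
    name = qname.rstrip(".").lower()
    return any(name == suffix or name.endswith("." + suffix) for suffix in allowed)
```

Matching on a leading dot matters: without it, a look-alike such as evilcorp.example would pass a naive `endswith("corp.example")` check.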

Networks where internet access is required should be carefully designed so that proper segmentation is implemented, separating critical systems from internet users. Forcing internet clients to use a local (monitored) resolver may help with detection of infected hosts making rogue queries.

There are currently a variety of methods for detecting malicious DNS traffic. Many make use of some form of statistical analysis and AI to get a better indication of when DNS tunneling is in use (Born and Gustafson 2010), and have been relatively successful in their classification. Character frequency analysis may also be useful for classifying DNS traffic. Other work has been done on detecting tunnels based on DNS query and response message patterns with a relatively high success rate (Binsalleeh et al. 2014).

Many of the detection mechanisms focus on high throughput systems, which is often the case when doing some form of DNS tunneling. Covert, low throughput data exfiltration must be handled in a different way. Research (Nadler, Aminov, and Shabtai 2019) has been done in this field with promising results. This form of detection makes use of machine learning classifiers trained on both legitimate and malicious DNS traffic so that classification can be done using anomaly detection. This type of detection becomes more difficult as threat actors remove any throughput optimizations to rather focus on making their traffic look legitimate.

Barracuda Networks advertises DNS-tunnel blocking functionality as a feature of their firewalls. At the time of writing, public test data of the firewall performance against common DNS tunneling solutions is not available, but future work should be done on testing the detection capability of such a firewall against different DNS tunneling solutions. From what we have learned, DNS tunnels have some variance in how they function, so it is possible that the firewall is only able to stop the high speed tunnels.

Conclusion

DNS is a core internet function, critical for the operation of many internet protocols. Its alternative use cases beyond what it was designed for (such as DNS tunneling) are something that may easily be overlooked when planning a network. IP tunneling over DNS allows users to bypass firewall rules, and malware to communicate covertly. Using DNS tunnels to avoid paying for hotspot or mobile network access, although possibly unethical, is still a possibility (albeit a slow one). We have seen that DNS has been used for data extraction in high-profile data breaches. The tunneling software varies in design, and there is no one-size-fits-all solution for detecting and stopping such traffic. Network administrators can deploy different monitoring solutions, but some analysis would be required to properly categorize the traffic. Requiring external domain name resolution to go via a local monitored resolver is a good start. Having an understanding of how DNS malware functions may reduce incident response times, potentially stopping malware before serious damage can be done.