Clearwater DNS Usage¶
DNS is the Domain Name System. It maps service names to hostnames, and hostnames to IP addresses. Clearwater relies heavily on DNS.
This document describes
- Clearwater’s DNS strategy and requirements
- how to configure AWS Route 53 and BIND to meet them.
DNS is also used as part of the ENUM system for mapping E.164 numbers to SIP URIs. This isn’t discussed in this document - instead see the separate ENUM document.
If you are installing an All-in-One Clearwater node, you do not need any DNS records and can ignore the rest of this page.
Strategy¶
Clearwater makes heavy use of DNS to refer to its nodes. It uses DNS for
- identifying individual nodes, e.g. sprout-1.example.com might resolve to the IP address of the first sprout node
- identifying the nodes within a cluster, e.g. sprout.example.com resolves to all the IP addresses in the cluster
- fault-tolerance
Clearwater also supports using DNS for identifying non-Clearwater nodes. In particular, it supports DNS for identifying SIP peers using NAPTR and SRV records, as described in RFC 3263.
Resiliency¶
By default, Clearwater routes all DNS requests through an instance of dnsmasq running on localhost. This round-robins requests between the servers in /etc/resolv.conf, as described in its FAQ:
By default, dnsmasq treats all the nameservers it knows about as equal: it picks the one to use using an algorithm designed to avoid nameservers which aren’t responding.
If the signaling_dns_server option is set in shared_config (which is mandatory when using traffic separation), Clearwater will not use dnsmasq. Instead, resiliency is achieved by being able to specify up to three servers in a comma-separated list (e.g. signaling_dns_server=1.2.3.4,10.0.0.1,192.168.1.1), and Clearwater will fail over between them as follows:
- It will always query the first server in the list first
- If this returns SERVFAIL or times out (which happens after a randomised 500ms-1000ms period), it will resend the query to the second server
- If this returns SERVFAIL or times out, it will resend the query to the third server
- If all servers return SERVFAIL or time out, the DNS query will fail
Clearwater caches DNS responses for several minutes (to reduce the load on DNS servers, and the latency introduced by querying them). If a cache entry is stale, but the DNS servers return SERVFAIL or time out when Clearwater attempts to refresh it, Clearwater will continue to use the cached value until the DNS servers become responsive again. This minimises the impact of a DNS server failure on calls.
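The failover order described above can be mimicked from the command line with dig, which is useful when checking why a particular server is being skipped. This is an illustrative sketch only: the server addresses are the example ones from the text, and the query name sprout.example.com is a placeholder.

```shell
# Mimic Clearwater's failover order: try each signaling DNS server in turn
# and stop at the first one that answers. The addresses are the example
# ones from the text; on a real node, read them from signaling_dns_server.
result=""
for server in 1.2.3.4 10.0.0.1 192.168.1.1; do
  if command -v dig >/dev/null 2>&1 &&
     dig +time=1 +tries=1 @"$server" sprout.example.com A >/dev/null 2>&1; then
    result="$server"   # first responsive server wins
    break
  fi
  echo "$server failed or timed out; trying next server"
done
if [ -n "$result" ]; then echo "using $result"; else echo "DNS query fails"; fi
```

Note that the short dig timeout here only approximates Clearwater's randomised 500ms-1000ms timeout; dig's granularity is one second.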
Requirements¶
DNS Server¶
Clearwater requires the DNS server to support
- RFC 1034 and RFC 1035 - basic DNS
- RFC 2181 - clarifications to DNS
- RFC 2782 - SRV records
- RFC 3596 - AAAA records (if IPv6 is required).
Support for latency-based routing and health-checking is required for multi-site deployments.
Support for RFC 2915 (NAPTR records) is also suggested, but not required. NAPTR records specify the transport (UDP, TCP, etc.) to use for a particular service - without them, UEs will fall back to their default transport (probably UDP).
AWS Route 53 supports all these features except NAPTR. BIND supports all these features except latency-based routing (although there is a patch for this) and health-checking.
DNS Records¶
Clearwater requires the following DNS records to be configured.
- Bono
    * `bono-1.<zone>`, `bono-2.<zone>`... (A and/or AAAA) - per-node records for bono
    * `<zone>` (A and/or AAAA) - cluster record for bono, resolving to all bono nodes - used by UEs that don't support RFC 3263 (NAPTR/SRV)
    * `<zone>` (NAPTR, optional) - specifies transport requirements for accessing bono - service `SIP+D2T` maps to `_sip._tcp.<zone>` and `SIP+D2U` maps to `_sip._udp.<zone>`
    * `_sip._tcp.<zone>` and `_sip._udp.<zone>` (SRV) - cluster SRV records for bono, resolving to port 5060 for all of the per-node records
- Sprout
    * `sprout-1.<zone>`, `sprout-2.<zone>`... (A and/or AAAA) - per-node records for sprout
    * `scscf.sprout.<zone>` (A and/or AAAA) - cluster record for sprout, resolving to all sprout nodes that provide S-CSCF function - used by P-CSCFs that don't support RFC 3263 (NAPTR/SRV)
    * `scscf.sprout.<zone>` (NAPTR, optional) - specifies transport requirements for accessing sprout - service `SIP+D2T` maps to `_sip._tcp.scscf.sprout.<zone>`
    * `_sip._tcp.scscf.sprout.<zone>` (SRV) - cluster SRV record for sprout, resolving to port 5054 for all of the per-node records
    * `icscf.sprout.<zone>` (A and/or AAAA) - cluster record for sprout, resolving to all sprout nodes that provide I-CSCF function - used by P-CSCFs that don't support RFC 3263 (NAPTR/SRV)
    * `icscf.sprout.<zone>` (NAPTR, optional) - specifies transport requirements for accessing sprout - service `SIP+D2T` maps to `_sip._tcp.icscf.sprout.<zone>`
    * `_sip._tcp.icscf.sprout.<zone>` (SRV) - cluster SRV record for sprout, resolving to port 5052 for all of the per-node records
- Dime
    * `dime-1.<zone>`, `dime-2.<zone>`... (A and/or AAAA) - per-node records for dime
    * `hs.<zone>` (A and/or AAAA) - cluster record for homestead, resolving to all dime nodes
    * `ralf.<zone>` (A and/or AAAA) - cluster record for ralf, resolving to all dime nodes
- Homer
    * `homer-1.<zone>`, `homer-2.<zone>`... (A and/or AAAA) - per-node records for homer
    * `homer.<zone>` (A and/or AAAA) - cluster record for homer, resolving to all homer nodes
- Vellum
    * `vellum-1.<zone>`, `vellum-2.<zone>`... (A and/or AAAA) - per-node records for vellum
    * `vellum.<zone>` (A and/or AAAA) - cluster record for vellum, resolving to all vellum nodes
- Ellis
    * `ellis-1.<zone>` (A and/or AAAA) - per-node record for ellis
    * `ellis.<zone>` (A and/or AAAA) - "cluster"/access record for ellis
- Standalone application server (e.g. gemini/memento)
    * `<standalone name>-1.<zone>` (A and/or AAAA) - per-node record for each standalone application server
    * `<standalone name>.<zone>` (A and/or AAAA) - "cluster"/access record for the standalone application servers
Of these, the following must be resolvable by UEs - the others need only be resolvable within the core of the network. If you have a NAT-ed network, the following must resolve to public IP addresses, while the others should resolve to private IP addresses.
- Bono
    * `<zone>` (A and/or AAAA)
    * `<zone>` (NAPTR, optional)
    * `_sip._tcp.<zone>` and `_sip._udp.<zone>` (SRV)
- Ellis
    * `ellis.<zone>` (A and/or AAAA)
- Memento
    * `memento.<zone>` (A and/or AAAA)
If you are not deploying with some of these components, you do not need the DNS records to be configured for them. For example, if you are using a different P-CSCF (and so don’t need bono), you don’t need the bono DNS records. Likewise, if you are deploying with an external HSS (and so don’t need ellis), you don’t need the ellis DNS records.
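Once the records are configured, a quick sanity check is to query each of them with dig. The loop below just prints the commands to run; the zone name ims.example.com is a placeholder for your own, and records for components you are not deploying can be dropped from the list.

```shell
# Print a dig checklist for the core Clearwater records. The zone name is
# a placeholder; substitute your own deployment's zone.
ZONE="ims.example.com"
for entry in "bono-1.$ZONE A" "$ZONE NAPTR" "_sip._tcp.$ZONE SRV" \
             "_sip._udp.$ZONE SRV" "scscf.sprout.$ZONE A" \
             "icscf.sprout.$ZONE A" "hs.$ZONE A" "ralf.$ZONE A" \
             "homer.$ZONE A" "vellum.$ZONE A" "ellis.$ZONE A"; do
  echo "dig +short $entry"
done
```

Each printed command should return at least one address (or, for the SRV and NAPTR records, at least one record) when run against your DNS server.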
If your deployment is geographically redundant, then you need a DNS record per site for every cluster record mentioned above. For example, in a GR deployment with two sites, siteA and siteB, the requirements for Dime are:
* `dime-1.<zone>`, `dime-2.<zone>`... (A and/or AAAA) - per-node records for Dime (one record for each node in each site)
* `hs.siteA.<zone>` (A and/or AAAA) - cluster record for Homestead, resolving to all Dime nodes in siteA.
* `hs.siteB.<zone>` (A and/or AAAA) - cluster record for Homestead, resolving to all Dime nodes in siteB.
* `ralf.siteA.<zone>` (A and/or AAAA) - cluster record for Ralf, resolving to all Dime nodes in siteA.
* `ralf.siteB.<zone>` (A and/or AAAA) - cluster record for Ralf, resolving to all Dime nodes in siteB.
The exceptions to the above are Bono and Ellis.
Ellis doesn’t support geographic redundancy (indeed, it doesn’t support more than one Ellis node at all), so there’s no need for multiple DNS records.
Bono needs to be able to contact the Sprout nodes in each site, so it needs a DNS record that can resolve to all Sprouts. The expected Sprout/Bono DNS records for a GR deployment with two sites, siteA and siteB, are described below (only the S-CSCF records are included, for simplicity).
* `bono-1.<zone>`, `bono-2.<zone>`... (A and/or AAAA) - per-node records for Bono (one record for each node in each site)
* `<zone>` (A and/or AAAA) - cluster record for Bono, resolving to all bono nodes in all sites - used by UEs that don't support RFC 3263 (NAPTR/SRV)
* `<zone>` (NAPTR, optional) - specifies transport requirements for accessing Bono - service `SIP+D2T` maps to `_sip._tcp.<zone>` and `SIP+D2U` maps to `_sip._udp.<zone>`
* `_sip._tcp.<zone>` and `_sip._udp.<zone>` (SRV) - cluster SRV records for Bono, resolving to port 5060 for all of the per-node records
* `sprout-1.<zone>`, `sprout-2.<zone>`... (A and/or AAAA) - per-node records for Sprout (one record for each node in each site)
* `scscf.sprout.<zone>` (A and/or AAAA) - cluster record for Sprout, resolving to all Sprout nodes in all sites that provide S-CSCF function - used by P-CSCFs that don't support RFC 3263 (NAPTR/SRV)
* `scscf.sprout.<zone>` (NAPTR, optional) - specifies transport requirements for accessing Sprout - service `SIP+D2T` maps to `_sip._tcp.scscf.sprout.<zone>`
* `_sip._tcp.scscf.sprout.<zone>` (SRV) - cluster SRV record for Sprout, resolving to port 5054 for all of the per-node records
* `scscf.sprout.siteA.<zone>` (A and/or AAAA) - cluster record for Sprout, resolving to all Sprout nodes in siteA that provide S-CSCF function - used by P-CSCFs that don't support RFC 3263 (NAPTR/SRV)
* `scscf.sprout.siteA.<zone>` (NAPTR, optional) - specifies transport requirements for accessing Sprout - service `SIP+D2T` maps to `_sip._tcp.scscf.sprout.siteA.<zone>`
* `_sip._tcp.scscf.sprout.siteA.<zone>` (SRV) - cluster SRV record for Sprout, resolving to port 5054 for all of the per-node records in siteA
* `scscf.sprout.siteB.<zone>` (A and/or AAAA) - cluster record for Sprout, resolving to all Sprout nodes in siteB that provide S-CSCF function - used by P-CSCFs that don't support RFC 3263 (NAPTR/SRV)
* `scscf.sprout.siteB.<zone>` (NAPTR, optional) - specifies transport requirements for accessing Sprout - service `SIP+D2T` maps to `_sip._tcp.scscf.sprout.siteB.<zone>`
* `_sip._tcp.scscf.sprout.siteB.<zone>` (SRV) - cluster SRV record for Sprout, resolving to port 5054 for all of the per-node records in siteB
Configuration¶
Clearwater can work with any DNS server that meets the requirements above. However, most of our testing has been performed with AWS Route 53 and BIND, which are covered below.
The Clearwater nodes also need to know the identity of their DNS server. Ideally, this is done via DHCP within your virtualization infrastructure. Alternatively, you can configure it manually.
The UEs need to know the identity of the DNS server too. In a testing environment, you may be able to use DHCP or manual configuration. In a public network, you will need to register the <zone> domain name you are using and arrange for an NS record for <zone> to point to your DNS server.
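As a sketch of what that delegation looks like in the parent zone, assuming a placeholder subzone ims.example.com served by a nameserver at a placeholder address:

```
; In the parent zone (example.com) - delegate ims.example.com to your DNS
; server, with a glue A record since the nameserver is inside the subzone.
ims.example.com.     IN NS ns1.ims.example.com.
ns1.ims.example.com. IN A  203.0.113.53
```

The glue record is needed whenever the nameserver's own name falls inside the delegated zone.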
AWS Route 53¶
Clearwater’s automated install configures AWS Route 53 automatically; if you are using the automated install, you do not need to follow the instructions below.
The official AWS Route 53 documentation is a good reference, and most of the following steps are links into it.
To use AWS Route 53 for Clearwater, you need to create a hosted zone for your <zone> domain and then create the record sets described above within it.
Note that AWS Route 53 does not support NAPTR records.
BIND¶
To use BIND, you need to
- install it
- create an entry for your “zone” (DNS suffix your deployment uses)
- configure the zone with a “zone file”
- restart BIND.
Note that BIND does not support latency-based routing or health-checking.
Installation¶
To install BIND on Ubuntu, issue `sudo apt-get install bind9`.
Creating Zone Entry¶
To create an entry for your zone, edit the /etc/bind/named.conf.local file to add a line of the following form, replacing <zone> with your zone name.
zone "<zone>" IN { type master; file "/etc/bind/db.<zone>"; };
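The edit can be scripted as below. This is a sketch using a placeholder zone name and a local copy of the file so it can run anywhere; on a real server the file is /etc/bind/named.conf.local.

```shell
# Append the zone clause for a placeholder zone name. On a real server,
# point CONF at /etc/bind/named.conf.local instead of a local copy.
ZONE="ims.example.com"
CONF="./named.conf.local"
printf 'zone "%s" IN { type master; file "/etc/bind/db.%s"; };\n' \
       "$ZONE" "$ZONE" >> "$CONF"
# If BIND is installed, named-checkconf validates the syntax
command -v named-checkconf >/dev/null 2>&1 && named-checkconf "$CONF"
cat "$CONF"
```

Running named-checkconf before restarting BIND catches syntax errors that would otherwise stop the server from loading the zone.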
Configuring Zone¶
Zones are configured through “zone files” (defined in RFC 1034 and RFC 1035).
If you followed the instructions above, the zone file for your zone is at /etc/bind/db.<zone>.
For Clearwater, you should be able to adapt the following example zone file by correcting the IP addresses and duplicating (or removing) entries where you have more (or fewer) than 2 nodes in each tier.
$TTL 5m ; Default TTL
; SOA, NS and A record for DNS server itself
@ 3600 IN SOA ns admin ( 2014010800 ; Serial
3600 ; Refresh
3600 ; Retry
3600 ; Expire
300 ) ; Minimum TTL
@ 3600 IN NS ns
ns 3600 IN A 1.0.0.1 ; IPv4 address of BIND server
ns 3600 IN AAAA 1::1 ; IPv6 address of BIND server
; bono
; ====
;
; Per-node records - not required to have both IPv4 and IPv6 records
bono-1 IN A 2.0.0.1
bono-2 IN A 2.0.0.2
bono-1 IN AAAA 2::1
bono-2 IN AAAA 2::2
;
; Cluster A and AAAA records - UEs that don't support RFC 3263 will simply
; resolve the A or AAAA records and pick randomly from this set of addresses.
@ IN A 2.0.0.1
@ IN A 2.0.0.2
@ IN AAAA 2::1
@ IN AAAA 2::2
;
; NAPTR and SRV records - these indicate a preference for TCP and then resolve
; to port 5060 on the per-node records defined above.
@ IN NAPTR 1 1 "S" "SIP+D2T" "" _sip._tcp
@ IN NAPTR 2 1 "S" "SIP+D2U" "" _sip._udp
_sip._tcp IN SRV 0 0 5060 bono-1
_sip._tcp IN SRV 0 0 5060 bono-2
_sip._udp IN SRV 0 0 5060 bono-1
_sip._udp IN SRV 0 0 5060 bono-2
; sprout
; ======
;
; Per-node records - not required to have both IPv4 and IPv6 records
sprout-1 IN A 3.0.0.1
sprout-2 IN A 3.0.0.2
sprout-1 IN AAAA 3::1
sprout-2 IN AAAA 3::2
;
; Cluster A and AAAA records - P-CSCFs that don't support RFC 3263 will simply
; resolve the A or AAAA records and pick randomly from this set of addresses.
sprout IN A 3.0.0.1
sprout IN A 3.0.0.2
sprout IN AAAA 3::1
sprout IN AAAA 3::2
;
; Cluster A and AAAA records - P-CSCFs that don't support RFC 3263 will simply
; resolve the A or AAAA records and pick randomly from this set of addresses.
scscf.sprout IN A 3.0.0.1
scscf.sprout IN A 3.0.0.2
scscf.sprout IN AAAA 3::1
scscf.sprout IN AAAA 3::2
;
; NAPTR and SRV records - these indicate TCP support only and then resolve
; to port 5054 on the per-node records defined above.
sprout IN NAPTR 1 1 "S" "SIP+D2T" "" _sip._tcp.sprout
_sip._tcp.sprout IN SRV 0 0 5054 sprout-1
_sip._tcp.sprout IN SRV 0 0 5054 sprout-2
;
; NAPTR and SRV records for S-CSCF - these indicate TCP support only and
; then resolve to port 5054 on the per-node records defined above.
scscf.sprout IN NAPTR 1 1 "S" "SIP+D2T" "" _sip._tcp.scscf.sprout
_sip._tcp.scscf.sprout IN SRV 0 0 5054 sprout-1
_sip._tcp.scscf.sprout IN SRV 0 0 5054 sprout-2
;
; Cluster A and AAAA records - P-CSCFs that don't support RFC 3263 will simply
; resolve the A or AAAA records and pick randomly from this set of addresses.
icscf.sprout IN A 3.0.0.1
icscf.sprout IN A 3.0.0.2
icscf.sprout IN AAAA 3::1
icscf.sprout IN AAAA 3::2
;
; NAPTR and SRV records for I-CSCF - these indicate TCP support only and
; then resolve to port 5052 on the per-node records defined above.
icscf.sprout IN NAPTR 1 1 "S" "SIP+D2T" "" _sip._tcp.icscf.sprout
_sip._tcp.icscf.sprout IN SRV 0 0 5052 sprout-1
_sip._tcp.icscf.sprout IN SRV 0 0 5052 sprout-2
; dime
; =========
;
; Per-node records - not required to have both IPv4 and IPv6 records
dime-1 IN A 4.0.0.1
dime-2 IN A 4.0.0.2
dime-1 IN AAAA 4::1
dime-2 IN AAAA 4::2
;
; Cluster A and AAAA records - sprout, bono and ellis pick randomly from these.
hs IN A 4.0.0.1
hs IN A 4.0.0.2
hs IN AAAA 4::1
hs IN AAAA 4::2
ralf IN A 4.0.0.1
ralf IN A 4.0.0.2
ralf IN AAAA 4::1
ralf IN AAAA 4::2
;
; (No need for NAPTR or SRV records as dime doesn't handle SIP traffic.)
; homer
; =====
;
; Per-node records - not required to have both IPv4 and IPv6 records
homer-1 IN A 5.0.0.1
homer-2 IN A 5.0.0.2
homer-1 IN AAAA 5::1
homer-2 IN AAAA 5::2
;
; Cluster A and AAAA records - sprout picks randomly from these.
homer IN A 5.0.0.1
homer IN A 5.0.0.2
homer IN AAAA 5::1
homer IN AAAA 5::2
;
; (No need for NAPTR or SRV records as homer doesn't handle SIP traffic.)
; vellum
; =====
;
; Per-node records - not required to have both IPv4 and IPv6 records
vellum-1 IN A 6.0.0.1
vellum-2 IN A 6.0.0.2
vellum-1 IN AAAA 6::1
vellum-2 IN AAAA 6::2
;
; Cluster A and AAAA records - sprout, homer and dime pick randomly from these.
vellum IN A 6.0.0.1
vellum IN A 6.0.0.2
vellum IN AAAA 6::1
vellum IN AAAA 6::2
;
; (No need for NAPTR or SRV records as vellum doesn't handle SIP traffic.)
; ellis
; =====
;
; ellis is not clustered, so there's only ever one node.
;
; Per-node record - not required to have both IPv4 and IPv6 records
ellis-1 IN A 7.0.0.1
ellis-1 IN AAAA 7::1
;
; "Cluster"/access A and AAAA record
ellis IN A 7.0.0.1
ellis IN AAAA 7::1
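After editing the zone file, it is worth validating it before restarting BIND. The snippet below is a sketch using a placeholder zone name and the file path from the instructions above; it skips cleanly if BIND's tools or the file are absent.

```shell
# Validate the zone file with BIND's named-checkzone, if available.
# The zone name and path are placeholders for your own.
ZONE="ims.example.com"
ZONEFILE="/etc/bind/db.$ZONE"
if command -v named-checkzone >/dev/null 2>&1 && [ -f "$ZONEFILE" ]; then
  named-checkzone "$ZONE" "$ZONEFILE" && checked=ok || checked=failed
else
  checked=skipped
  echo "named-checkzone or $ZONEFILE not present; skipping check"
fi
echo "zone check: $checked"
```

named-checkzone reports missing SOA/NS records, bad record syntax, and dangling names, which are the most common zone-file mistakes.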
Restarting¶
To restart BIND, issue `sudo service bind9 restart`. Check /var/log/syslog for any error messages.
Client Configuration¶
Clearwater nodes need to know the identity of their DNS server. Ideally, this is achieved through DHCP. There are two main situations in which it might need to be configured manually.
- When DNS configuration is not provided via DHCP.
- When incorrect DNS configuration is provided via DHCP.
Either way, you must
- create an /etc/dnsmasq.resolv.conf file containing the desired DNS configuration (probably just the single line `nameserver <IP address>`)
- add `RESOLV_CONF=/etc/dnsmasq.resolv.conf` to /etc/default/dnsmasq
- run `service dnsmasq restart`.
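The first two steps can be sketched as below. To keep the sketch safe to run anywhere, it writes to a local demo directory and uses a placeholder server address; on a real node the files live under /etc and the last step is `service dnsmasq restart`.

```shell
# Sketch of the manual dnsmasq configuration steps, using a local demo
# directory; on a real node the files live under /etc, and 10.0.0.53
# stands in for your actual DNS server's address.
ETC="./dnsmasq-demo"
mkdir -p "$ETC/default"
printf 'nameserver %s\n' "10.0.0.53" > "$ETC/dnsmasq.resolv.conf"
echo 'RESOLV_CONF=/etc/dnsmasq.resolv.conf' >> "$ETC/default/dnsmasq"
# On a real node, finish with: service dnsmasq restart
cat "$ETC/dnsmasq.resolv.conf" "$ETC/default/dnsmasq"
```

Note the RESOLV_CONF value always points at the real /etc path, since that is where dnsmasq will look on the node itself.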
(As background, dnsmasq is a DNS forwarder that runs on each Clearwater node to act as a cache. Local processes look in /etc/resolv.conf for DNS configuration, and this points them to localhost, where dnsmasq runs. In turn, dnsmasq takes its configuration from /etc/dnsmasq.resolv.conf. By default, dnsmasq would use /var/run/dnsmasq/resolv.conf, but this is controlled by DHCP.)
IPv6 AAAA DNS lookups¶
Clearwater can be installed on an IPv4-only system, an IPv6-only system, or a system with both IPv4 and IPv6 addresses (though the Clearwater software does not use both IPv4 and IPv6 at the same time).
Normally, systems with both IPv4 and IPv6 addresses will prefer IPv6, performing AAAA lookups first and only trying an A record lookup if that fails. This may cause problems (or be inefficient) if you know that all your Clearwater DNS records are A records.
In this case, you can configure a preference for A lookups by editing /etc/gai.conf and uncommenting the line `precedence ::ffff:0:0/96 100` (as described at http://askubuntu.com/questions/32298/prefer-a-ipv4-dns-lookups-before-aaaaipv6-lookups).
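The edit can be made with a one-line sed. To keep this sketch safe to run anywhere, it operates on a stand-in file containing just the default (commented-out) line; on a real system, apply the same sed to /etc/gai.conf itself.

```shell
# Demonstrate uncommenting the precedence line on a stand-in file; on a
# real system, run the sed against /etc/gai.conf instead.
printf '#precedence ::ffff:0:0/96  100\n' > ./gai.conf.demo
# Remove the leading "#" so IPv4-mapped destinations are preferred,
# making A lookups take precedence over AAAA
sed -i 's|^#[[:space:]]*\(precedence ::ffff:0:0/96\)|\1|' ./gai.conf.demo
grep '^precedence' ./gai.conf.demo
```

The change takes effect for new processes immediately; no service restart is needed, since gai.conf is read by getaddrinfo() at lookup time.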