Archive for May, 2010

DNS records, TTLs and SORBS.

May 23 2010 Published by Michelle Sullivan under DNS

Another common point of contention over SORBS’ policies is the perceived TTL requirement.

The TTL requirement applies only to the PTR record for single-IP delistings, and even then it is only a requirement for delisting from the SORBS DUHL database.

So what is a TTL and why is it an issue?

TTL means ‘Time To Live’ and is the length of time a record is held as valid and cached before being rechecked.  TTLs apply to many things, such as web pages and DNS records; web page TTLs tell your browser not to reload the page from the server for a minimum amount of time.  Browsers can ignore this value, but it is recommended that they do not.  In DNS, the TTL controls how long a lookup result may be cached, so that when you enter “http://www.sorbs.net/” the computer knows to look up www.sorbs.net, contact the host on port 80 and request the page “/” from the host www.sorbs.net.  In technical terms this is done in the following steps, completely transparently to the user:

First the browser looks at the protocol requested (in this case http) and knows it is an HTTP (web) request; as there is no port specified, it knows to use the default port of 80.

Next it grabs the hostname and looks up www.sorbs.net by sending a request to the local DNS server(s): “Give me the A record for www.sorbs.net”. The DNS server will then consult its authoritative records and, if it is not a SORBS name server, will not find it.  The server will then consult its “cache” entries, and if it finds an entry it checks the TTL.  The cached entry consists of two parts, the TTL value and the time the record was last retrieved: if the value added to the retrieval time is in the past, the server expunges the record and checks its “root zone” file; if the TTL plus the retrieval time is in the future, it returns the record to the host, which in turn gives it to the browser.

The ‘root zone’ file is a static list of ‘hints’ telling the server where to go to find answers.  In this case the hints point at the root servers, which when queried will return NS records for the domain ‘sorbs.net’, along with the IP addresses associated with the names in those NS records.  These NS records can then be used to locate the servers holding the authoritative information for the domain ‘sorbs.net’.  The resolving server (the one local to the browser) then asks the authoritative servers for ‘sorbs.net’ for the A record for ‘www.sorbs.net’; the result, the time it was retrieved and the TTL of the record are then added to the local ‘cache’ to save going through the whole process again for any other requests for the same information within the same period.  The resolving server then sends the information back to the host running the browser, which in turn passes it to the browser.

At this point the browser has the request for ‘http://www.sorbs.net/’ and a DNS record that says ‘www.sorbs.net is an A record with value 111.125.160.134’, and it will remember that record for the next 600 seconds (600 seconds being the TTL of www.sorbs.net).  The browser then uses the record to make an HTTP request to the server 111.125.160.134 on port 80 for the web page / on host www.sorbs.net.  The server sends the result and the browser displays it to the user.
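You can watch this caching in action with the dig command-line tool.  The output below is a sketch (the TTL and address are taken from the example above; real output will vary): the second query is answered from the resolver’s cache, so the TTL column has counted down.

$ dig +noall +answer www.sorbs.net A
www.sorbs.net.    600    IN    A    111.125.160.134

# a few seconds later, answered from the resolver's cache:
$ dig +noall +answer www.sorbs.net A
www.sorbs.net.    587    IN    A    111.125.160.134

Once that TTL column reaches zero the resolver discards the entry, and the full lookup described above happens all over again.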

Nice and simple you might think, and so nice that it’s all done completely transparently in the background so you don’t have to know what is going on.  So why is there an issue with the TTL?  Well there are many records in DNS, not just the ‘www’ records, some will tell you what the hostname of the machine is, some will tell you where to send email to, and as we have already seen, some will tell you where you can get more information from.  TTLs just tell you how long that information is valid for.

SORBS uses this information to approve or deny requests, and has on a number of occasions had people try to fake the information for malicious purposes (eg spammers getting delisted, then faking the information to divert attention to someone innocent when they send spam).  For this reason we mandated that requests for delisting of single IPs from the SORBS database meet a minimum requirement: the information must be valid for at least 12 hours, and preferably 24 hours.  We believe it should be valid for longer, but operationally this doesn’t make sense.  If a host is part of a cluster of servers and one of those servers has a problem, the administrators don’t want hundreds of people remembering where it is for hours at a time; they want all the requests to go to the remaining servers.  For this reason we don’t require TTLs to be any minimum value on the MX (Mail eXchanger) or A (Address) records, only on the PTR (reverse PoinTeR) records.

Now, why can the PTR record TTLs be high without being a problem?  Consider the following setup at SORBS.

SORBS has 7 mail servers in its data centers.  4 of these are ‘Mail eXchanger’ (MX) servers, and 2 of them are higher priority than the others.  They are as follows:

desperado.sorbs.net priority 10

scorpion.sorbs.net priority 10

catapillar.sorbs.net priority 5

anaconda.sorbs.net priority 5

The lower the number the higher the priority, so all email for sorbs.net should get sent to catapillar.sorbs.net and anaconda.sorbs.net, only going to scorpion.sorbs.net and desperado.sorbs.net if the first two are too busy to answer requests.  MX records give the list of hosts that handle email for sorbs.net, and those hosts when looked up will provide IP addresses (in the same way that the browser looks up a web server as described above.)
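In zone-file terms, the above would look something like this (a sketch; the 3600-second TTL shown is purely illustrative, since SORBS places no minimum on MX record TTLs):

sorbs.net.    3600    IN    MX    5     catapillar.sorbs.net.
sorbs.net.    3600    IN    MX    5     anaconda.sorbs.net.
sorbs.net.    3600    IN    MX    10    desperado.sorbs.net.
sorbs.net.    3600    IN    MX    10    scorpion.sorbs.net.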

If anaconda.sorbs.net were to suffer a real problem (eg the power supply fried and caught fire, taking out the whole host), remote servers would still try it roughly every second time, time out and then retry with catapillar; if catapillar is too busy they would fall back to either desperado or scorpion.  This means mail might be delayed, so we would probably want to update the server list to exclude anaconda, and if we have set the TTL to 86400 some servers will remember that anaconda is one of the servers for up to 1 day.  Setting it longer results in a bigger delay for the change.  This is of course undesirable, and is often used as the excuse for not changing the TTLs to the length SORBS requires for single IP delistings.

The argument is flawed.

SORBS requests only that the PTR records be set to 86400 seconds (1 day, the preferred value) and not the A or MX records, so if you want to move the servers, or reconfigure because of an outage or other issue, you can do whatever you want.  The PTR record is the reverse PoinTeR record that translates the IP address back to a hostname.  The PTR record is used when your host contacts one of SORBS’ servers, for example when you try to send email to it.  It is also used (for our records) when you request a web page, or when you register on SORBS.  How is it used, you might ask..?  Well, in most cases it’s just recorded to prevent or identify abuse; in other cases it might be used to block or allow access.  So as you can see it is not used for anything that would affect you operationally (unless you are trying to abuse something), so the TTL should not really matter.  In fact the only time we have ever seen a TTL matter is when a mistake is made, and the operator/admin wishes to correct the mistake but hosts remember the mistake for hours or days.
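A PTR lookup can be performed with dig’s -x option, which builds the in-addr.arpa query for you.  This is a sketch only (the hostname returned and the TTL shown, here the preferred 86400 seconds, are illustrative):

$ dig +noall +answer -x 111.125.160.134
134.160.125.111.in-addr.arpa.    86400    IN    PTR    www.sorbs.net.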

Many people use the “but I need to migrate my networks” excuse to try to invalidate the policy.  There are two reasons why this is not valid:

  • First, you don’t migrate every day, and when you do migrate you should be planning it well in advance, changing the TTLs on everything to smaller and smaller values as the migration approaches (as sketched after this list).
  • Second, remember the PTR record is only used when your server contacts our server(s), so if you move that server to another IP, the new IP’s PTR record should be set up accordingly (and in advance).  When the migration takes place, the old and disused IP will retain its PTR record in caches for the length of the TTL (usually 1 day), but as there is no server on it, it will never be seen or recorded by anyone.
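As a sketch of the first point (the names, addresses and schedule here are illustrative), the reverse zone would be edited in stages:

; weeks before the migration, in the reverse zone for 192.0.2.0/24
4    86400    IN    PTR    mail.example.com.

; a day or two before the move, shorten the TTL so caches expire quickly
4    3600     IN    PTR    mail.example.com.

; after the move, the new IP's PTR record carries the long TTL again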

Another common argument against the policy comes from ISPs, who say ‘but our customers might be disconnected and another customer allocated their IP address’.

Good network management would mean that when a customer leaves, their IP addresses go into the ‘unallocated’ pool at the back of the list, so an address is not re-used until all the other addresses have been cycled through.  We know of no ISP whose turnover of customers on static IP addresses rotates through all the available space within 24 hours.

There are many other arguments for and against the policy, but we at SORBS have considered each one and can find no valid reason to have PTR record TTLs of 60 seconds, except where it comes to abuse, where someone will fake information, do something bad and then change it back.  We therefore mandate that if you want to remove a single IP address from the SORBS DUHL, the PTR record has to have a minimum TTL of 12 hours (43200 seconds.)
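If you want to check your own record before requesting a delisting, query the authoritative name server directly; a caching resolver will show a counted-down TTL rather than the configured one.  A sketch (the server name, address and hostname are illustrative):

$ dig +noall +answer -x 192.0.2.4 @ns1.example.com
4.2.0.192.in-addr.arpa.    43200    IN    PTR    mail.example.com.

The TTL column (here 43200) must be at least 43200 for the robot to accept the request.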

Note: Throughout this document we refer to a single IP address; this is because networks of 256 addresses (/24) or larger do not fall under the same policy, as it is likely the ISP is performing a ‘mass update’ of data.  Should anyone request delisting from the SORBS database, the support robot looks at the PTR records for all addresses in the request and will reject the request if the TTLs are lower than 43200 seconds (even if it’s a network request.)  This is because the robot is simple, and is only for sorting the rubbish from the good requests.  Any ISP representative requesting a network delisting would need to reply to the robot’s response, which results in the request automatically being given to a SORBS administrator (a real human being) who can apply much more complex logic to analyse and formulate a response to the request.


Dynamic IPs and rDNS

May 22 2010 Published by Michelle Sullivan under DNS

A continual source of complaints by the uneducated is that SORBS is trying to tell people how to run their networks.  Unfortunately there is not much we can say except that they are wrong on so many levels.  Here is some information about why.

First, we don’t want email from networks that can’t follow basic network setup of services.  If you, as a network administrator, cannot be bothered to set up your network in accordance with good/best current practices, we can be fairly sure you don’t care about basic network hygiene, and we would probably see very little legitimate traffic from your site as opposed to masses of spam and other nasty traffic.

So what does best current practices really mean?

Well first basic setup for a network is getting routed, and if you get that wrong, well you’re not going to get any traffic so that’s not really an issue.

Next you’d give names to everything on your network.  Now this is where the first issue starts.  There are these things called ‘A’ records, ‘MX’ records, and ‘NS’ records (to name the most important few.)

NS records provide the “glue” for DNS (Domain Name System, the service that translates www.yourdomain.example.com into an IP address).  The “glue” tells the system where the authoritative name servers are for your domain(s); without it, anyone requesting a lookup would not be able to work out where to send the request.

A records provide the name to IP address translation.  Each A record has a hostname which, when combined with the domain name (located using the NS records), returns the IP address(es) of the computer/server/host/cluster that you want to talk to.  The IP addresses are used for the computers to talk to each other, not the names.

MX records are “Mail eXchanger” records: records that tell your computer where the server is that will receive email for the domain you are trying to email.

This all seems quite simple, and most people get this right as there are many examples on the internet of how to set up your own domain.  The problem comes with the other records that people ignore, such as the ‘PTR’ records (reverse “PoinTeR” records); in most cases PTR records are not required to make things work, so people ignore them.  The PTR record, though, is used by remote machines (such as email servers) to verify who is sending the email and to provide a human-readable view.  What this means is that when a computer has the IP address “1.2.3.4” (which really means nothing), a mail server receiving a connection from it will check for a PTR record, and if it gets one (eg: mail.example.com) will insert that information into the message, so anyone with a filter that says “I want email from example.com servers” will be able to whitelist it.  Seems simple enough so far; however, some people also use it as the basis of a complex set of rules, and will usually check that the PTR records match the A records and vice versa (to ensure no forgery) and reject all mail when this fails.  They may also (like AOL) reject all mail from IP addresses that don’t employ PTR records.

Best Current Practices (and the same recommended practice since the early 1990’s) are that every host has an A record and the IP address used in the A record has a matching PTR record.  Of course, with the later development of the internet, IP address space started to fill up, and so you would find that one server might have many A records from differing domains.  RFC1912 was created in 1996 to cover common DNS configuration errors, and it addresses the PTR record from a BCP point of view: A records do not have to have a matching PTR record, but all PTR records MUST have a matching A record (Section 2.1).  Here is the quote:

Make sure your PTR and A records match. For every IP address, there should be a matching PTR record in the in-addr.arpa domain. If a host is multi-homed, (more than one IP address) make sure that all IP addresses have a corresponding PTR record (not just the first one). Failure to have matching PTR and A records can cause loss of Internet services similar to not being registered in the DNS at all. Also, PTR records must point back to a valid A record, not a alias defined by a CNAME. It is highly recommended that you use some software which automates this checking, or generate your DNS data from a database which automatically creates consistent data.
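Concretely, a matching pair of records looks like this (a sketch using the documentation address range 192.0.2.0/24; the names are illustrative):

; in the forward zone example.com:
mail.example.com.           3600     IN    A      192.0.2.4

; in the reverse zone 2.0.192.in-addr.arpa:
4.2.0.192.in-addr.arpa.     86400    IN    PTR    mail.example.com.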

This PTR issue is often the source of support calls to ISPs, and as they want a quiet time, they take the lazy option and create PTR records that look like ‘1.2.3.4.example.com’ or ‘1.2.3.4.dsl.example.com’, and then create a bunch of associated A records so the “forward-reverse” verification doesn’t fail.  Being the ‘lazy’ option, they also make the process as simple as possible: some more advanced admins are familiar with ‘$GENERATE’ lines, some just write scripts to populate the zone files statically.  The ‘$GENERATE’ lines are not as silly an idea as some think, though, as they can be overridden.

To use the $GENERATE line you can add the following to a zone file:

$GENERATE 11-254 $ PTR dhcp$.example.com.

This will generate the following records:

11 PTR dhcp11.example.com.

12 PTR dhcp12.example.com.

13 PTR dhcp13.example.com.

...

253 PTR dhcp253.example.com.

254 PTR dhcp254.example.com.

Note: $GENERATE can be used for both PTR and A records so you can also use them to add the A records to match the PTR records.
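For example, the matching forward records could be produced in the forward zone with a directive like this (a sketch; the 192.0.2.0/24 network is illustrative):

$GENERATE 11-254 dhcp$ A 192.0.2.$

This produces dhcp11.example.com pointing at 192.0.2.11 through dhcp254.example.com pointing at 192.0.2.254, keeping the forward and reverse data consistent.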

So where does SORBS come into all this?  Well, if SORBS sees such generated records for 65,000 consecutive addresses, we feel that the admins of the network can’t be bothered with that part of their network, or that they feel everything in that area is non-consequential hosts (eg: home users that are browsing the Internet, and therefore not running servers.)  The result is that we can block those hosts, as they either aren’t or shouldn’t be running email servers.

We believe all hosts on the Internet that are servers should be registered in the Domain Name System (DNS), and we believe that for hosts that have multiple services (eg a web server hosting hundreds of domains) the server’s own hostname should be registered with both A and matching PTR records.  The hosted web services require A records of their own, but matching PTR records for them are not mandated and should be skipped (the single PTR should point at the server’s hostname).  This is best current practice, and everyone who follows these practices never runs into issues with SORBS.

If you are an administrator that doesn’t follow the best current practices and has problems with SORBS, now you know why…

Now, why was the $GENERATE information given?  Well, there are some ISPs who wish to set a “default”: a set of PTR records that apply to everyone where the customer has not set their own (yet).  For this the $GENERATE records are ideal, but the choice of what goes in them causes issues.  SORBS looks at these records, and if we see things like ‘127.0.0.1.static.example.com’ we know the administrators of the network statically assign these addresses to their customers.  If we see ‘127.0.0.1.dsl.example.com’ we conclude that the administrators have assigned generic DSL lines to that address space.  Generic DSL means, in the majority of cases, the home user with a web browser who would never even consider running a mail server, but who often gets infected with viruses and trojans that send spam; consequently we block it.
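Such a default can be produced with the same mechanism.  For instance, in the reverse zone for a network (a sketch; the 192.0.2.0/24 range and the ‘static’ token are illustrative), a line like:

$GENERATE 1-254 $ PTR 192.0.2.$.static.example.com.

generates default PTR records of 192.0.2.1.static.example.com. through 192.0.2.254.static.example.com., which the ISP can replace with customer-specific entries as they are assigned.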

Is this SORBS telling you how to run your network?  Not really; you can run your network however you want.  What SORBS is telling you is that if you want to run mail servers from your network, we will not accept mail whilst you look like the majority of users (the majority being home users.)  On the other hand, if you run your network conforming to best current practices, we will be happy to accept mail from you.

To try and simplify things (and cut down the 27,000+ different patterns in use in 2005), staff at SORBS created a draft RFC defining the ‘tokens’ used in generic PTR records.  Whilst it was in the process of submission, it was suggested that it should be submitted as a standard rather than an RFC.  For this to happen the draft was submitted to the DNS Operations working group, and the original draft was allowed to expire.  Unfortunately there are a lot of more important issues ahead of the proposed standard, and it is going to be some years before it is progressed.
