With all respect to Sam Roberts, I’ve been wondering for the last little while where all the good application fingerprinters have gone. There used to be a lot of programs (good or bad) that tried to fingerprint which application was running on which remote service. I’m not just talking about software that attempted to figure out what protocol was running on what port, but what application was running on each (e.g. Apache vs IIS vs Tomcat).
Historically, there have been a lot of programs that attempted to do this. They weren’t necessarily very good, but at least they attempted to figure out what was running where. Some examples are the SMTP scanner SMTPScan, TelnetFP, fpdns and HTTPPrint. There are many more examples out there, such as Jeremiah Grossman’s HTTP Fingerprinter released at BlackHat Asia 2002, but most of them never got to a point where they were good enough to be considered supported/usable. (As a side note, fpdns is actually very good.) Now, each of these programs had its own flaws and none of them was that accurate, but they gave members of the security and administration community a good tool to use when they needed to know what was running where.
Probably the fingerprinter that is used most often, and the tool that probably killed active fingerprinting development, is Nmap. Nmap is a decent fingerprinting tool that uses a global database, mostly submitted by its user base, to accompany its popular port scanning software. As a tool, it makes sense to pair the two together (especially with OS fingerprinting). The problem with it is that it isn’t really that accurate. One of the biggest problems (and advantages) with nmap is that most of the fingerprints are submitted by the user community. It is up to the nmap admins to decide whether the fingerprints submitted are good or bad. If you don’t have access to the software, you can only presume that a fingerprint submitted by a third party is accurate. As an example, assume that a user submits a fingerprint and says it represents the Apache web server. What happens if it actually came from a web server running IIS with a modified banner, and the submitter does not know the banner has been changed? If the fingerprint gets included in the next fingerprinting database, there is a distinct possibility that it will taint all future results. Now, I’m not saying that nmap is not a good fingerprinting tool to use if you want to know what is running on your network. I’ve used it many times to try to figure out quickly what is running where. However, I am on the fence as to whether taking user-submitted information and including it is a good idea. For what it is and who it is targeted at, it is a decent tool to use. I just think that a better job could be done.
So, what is wrong with the other tools that are out there? Admittedly, most of them haven’t been updated for years and there is almost no way they could be accurate anymore. However, they mostly all have the same flaws. One of the biggest problems they have is that, for the most part, they build fingerprints based on a set of status codes/responses. So, as an example, smtpscan sent 10 sendcases, recorded the status codes that each SMTP server returned, and built a fingerprint from that. The problem with doing this is that it is very rare for a server to be configured in such a way that it responds identically to the same 10 sendcases every time and in every situation. Everything is configurable. This means that for every minute configuration change that is seen, a new fingerprint has to be generated. Therefore, for some of the more popular services out there (e.g. Sendmail), there could be hundreds or thousands of fingerprints, one for each permutation known or discovered. This becomes unmanageable.
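To make the problem concrete, here is a minimal sketch of exact-match fingerprinting on a fixed set of reply codes. It is not how smtpscan is actually implemented; the server name, the five-probe tuple and the reply codes are all invented for illustration.

```python
# Hypothetical database: one fixed tuple of reply codes per known server.
# The server name and codes are invented, not real smtpscan data.
FINGERPRINTS = {
    (250, 501, 250, 500, 550): "ExampleMTA (default config)",
}

def identify(codes):
    """Exact-match lookup: every reply code must agree with a stored tuple."""
    return FINGERPRINTS.get(tuple(codes), "unknown")

def variants_needed(configurable_probes, codes_per_probe):
    """Number of distinct tuples a single server can produce when each of
    `configurable_probes` replies can take `codes_per_probe` different values."""
    return codes_per_probe ** configurable_probes

# The default configuration matches...
print(identify([250, 501, 250, 500, 550]))
# ...but changing one reply (here the last one) is enough to miss entirely.
print(identify([250, 501, 250, 500, 252]))
# Even two possible replies on each of 10 sendcases means 1024 tuples to store.
print(variants_needed(10, 2))
```

Under these assumptions a single configuration change turns a known server into "unknown", and the only fix within this design is to store yet another tuple.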
So what makes a good fingerprinting tool? Personally, I don’t know if there is a way to make a really good one. However, a good one should have at least two things:
1) Neither all for one nor one for all. I think the most accurate way to determine what something is running is not to limit yourself to one send/check clause and treat it as the be-all and end-all. The best solution is to use many different sendcases, but to choose the next sendcase only after getting the result of the previous one. That way, there is no limit to the number of checks that you can use to fingerprint a server. After each check, you eliminate some of the servers it could be, until you are left with one or none. This removes the reliance on just one check, or on expecting ALL checks to match the way you expect.
2) Guess, but don’t always. There are a lot of tools that either always attempt to guess what the service is or never guess at all. There are pros and cons to each. However, I think the best thing to do is to guess when you are fairly confident that you are within a certain range of values. It’s not entirely useful to find out that your web server is either CERN or MathOpd. What does that mean? There has to be some way to differentiate between two things so different. Now if you can’t tell the difference between Apache 1.3.12 and Apache 1.3.13, well, at this point it doesn’t really matter. You have it narrowed down so well that saying it is Apache 1.3.12 - 1.3.13 is perfectly fine.
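The two points above can be sketched together in a few lines of Python. This is only an illustration of the idea, not the tool mentioned in this post: the probe names and the responses ("r1", "r2", ...) are invented placeholders and say nothing about how the real servers behave.

```python
# KNOWN[(product, version)][probe] = response expected from that server.
# Probe names and responses are placeholders invented for this sketch.
KNOWN = {
    ("Apache", "1.3.12"): {"probe_a": "r1", "probe_b": "r1"},
    ("Apache", "1.3.13"): {"probe_a": "r1", "probe_b": "r1"},
    ("IIS", "5.0"):       {"probe_a": "r2", "probe_b": "r1"},
    ("Tomcat", "4.1"):    {"probe_a": "r3", "probe_b": "r2"},
}

def pick_probe(candidates, used):
    """Pick the unused probe that splits the remaining candidates into the
    most groups; return None if no probe can tell them apart."""
    best, best_groups = None, 1
    probes = {p for c in candidates for p in KNOWN[c]} - used
    for probe in sorted(probes):
        groups = len({KNOWN[c].get(probe) for c in candidates})
        if groups > best_groups:
            best, best_groups = probe, groups
    return best

def fingerprint(send):
    """send(probe) returns the target's response. Each answer eliminates
    candidates, and the next probe is chosen only after seeing it."""
    candidates, used = set(KNOWN), set()
    while len(candidates) > 1:
        probe = pick_probe(candidates, used)
        if probe is None:
            break  # nothing left that would narrow the set further
        used.add(probe)
        answer = send(probe)
        candidates = {c for c in candidates if KNOWN[c].get(probe) == answer}
    return candidates

def report(candidates):
    """Guess only when the remaining candidates are one product; give a
    version range if several versions of that product remain."""
    if not candidates:
        return "unknown"
    products = {name for name, _ in candidates}
    if len(products) > 1:
        return "inconclusive"
    versions = sorted((v for _, v in candidates),
                      key=lambda v: tuple(int(x) for x in v.split(".")))
    name = products.pop()
    if len(versions) == 1:
        return f"{name} {versions[0]}"
    return f"{name} {versions[0]} - {versions[-1]}"

# Simulated target that answers like the two Apache entries above.
target = {"probe_a": "r1", "probe_b": "r1"}
print(report(fingerprint(lambda probe: target[probe])))
```

With this made-up data the run stops after a single probe, because the second probe cannot separate the two remaining Apache versions, and the result is reported as a range rather than forced into one answer.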
Funnily enough, I will be speaking on this subject at SecTor, a security conference in Toronto, on November 21st. My talk, Modern Trends in Network Fingerprinting, is with a co-worker of mine and will go into more of the above. We’ll also be releasing an HTTP Fingerprinter that I hope is better than most of the ones that are out there. It has a lot of restrictions placed on it for the purpose of the conference, most of which violate what I mention above, but hopefully it rekindles the drive for better fingerprinters. I personally believe that this is a skillset that is slowly being ignored, even though it can be really important.