Footprinting and Port Scanning

Overview

Footprinting refers to the gathering of information about an organization's computer systems, from the outside of those systems looking in. Attackers do this to look for weaknesses in a system, but system administrators should do it too, so they can make sure information isn't accessible when it shouldn't be.

Footprinting can be done in both technical and more "physical" ways. We'll focus on technical footprinting here, but it can also include strategies like:

"Shoulder surfing": looking over someone's shoulder as they enter passwords, PIN numbers, etc.
"Dumpster diving": searching through trash or recycling for sensitive information.
"Piggybacking": following behind someone through a locked door to gain entry to a building.

Public Information

Part of footprinting is gathering public information that can be readily accessed through an organization's website or similar:

Organization contact names, emails, and phone numbers
Published privacy or security policies
News articles about the organization

We can also gather information about systems via whois lookups. The whois protocol is used for querying domain and IP address ownership and other registration information. It can also be accessed through websites such as https://www.whois.com/whois. A whois record typically lists the registrar, name servers, and registration dates, and sometimes contact details, all of which give us further leads to search on.
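As a rough illustration of how simple the whois protocol is, the sketch below opens a TCP connection to port 43, sends the name followed by a line ending, and reads until the server hangs up. The function names are made up for this example, and the default server (whois.iana.org) is an assumption; in practice the right server depends on the domain being queried.

```python
import socket

WHOIS_PORT = 43  # TCP port assigned to the whois protocol


def build_whois_query(name):
    """A whois query is just the name followed by CRLF."""
    return name.strip().encode("ascii") + b"\r\n"


def whois_lookup(name, server="whois.iana.org", timeout=10.0):
    """Send one query and read until the server closes the connection.

    The default server is an assumption made for this sketch.
    """
    with socket.create_connection((server, WHOIS_PORT), timeout=timeout) as sock:
        sock.sendall(build_whois_query(name))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # empty read: the server closed the connection
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


# Example use (needs network access):
#   print(whois_lookup("example.com"))
```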

Websites also provide information on systems being run. The source code of sites can provide information on technology being used to build and host them. For example, https://builtwith.com/ shows server info, libraries, etc. harvested from scanning site source code.

This kind of information can be used by attackers to look for systems built using technology with known vulnerabilities.

Port Scanning

Networked systems accept connections using ports, which are used by the transport layer to match packets being sent to a system to running applications. Port scanning is the systematic searching for open ports on a system. If a port is open, it indicates that an application is running on that port, and we can exchange data with it.

If we find an open port, we would then want to know what application is running on it. If it is on a well-known port number (such as 22 for SSH, 443 for HTTPS, etc.) then we can be pretty sure what's running. If it isn't, then we can try sending it different payloads and see how it responds. For instance, we can send a server an HTTP request and see if it responds using the HTTP protocol.
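The idea of sending a payload and checking the reply can be sketched as follows. This is a minimal example, not a full service detector: it sends a bare HEAD request and only checks whether the reply begins like an HTTP status line. The names probe_http and looks_like_http are hypothetical.

```python
import socket


def looks_like_http(response):
    """HTTP responses begin with a status line such as 'HTTP/1.1 200 OK'."""
    return response.startswith(b"HTTP/")


def probe_http(host, port, timeout=3.0):
    """Send a minimal HTTP request and report whether the reply looks like HTTP."""
    request = b"HEAD / HTTP/1.0\r\nHost: " + host.encode("ascii") + b"\r\n\r\n"
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.sendall(request)
            reply = sock.recv(1024)
    except OSError:
        # Refused, timed out, or reset: no evidence of an HTTP server here.
        return False
    return looks_like_http(reply)


# Example use, against a machine we administer:
#   probe_http("127.0.0.1", 8080)
```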

Port scanning can reveal running programs that the owner of a system may not even be aware are running. For example, Apache and other web servers can be configured to serve multiple sites on different ports. It's possible for Apache to serve secondary, old, or test sites on a different port from the main site. It's also possible a server has SSH, FTP, telnet or something else running.

A port scan essentially involves sending some type of packet to a system on a port. We can target specific ports or just try a bunch. When we do this, one of three things can happen:

We get an accept response back, indicating that the port has an application listening on it.
We get a closed response, indicating that there is no application listening on it.
We get no response at all, which tells us either our packet never reached the machine (likely filtered out by a firewall), or that the machine chose not to send a response back.
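The three outcomes above can be observed with an ordinary TCP connection attempt. The sketch below is one simple way to do it, assuming a full connection is acceptable (it is not how every scanner works); the function name and return labels are chosen for this example.

```python
import socket


def probe_tcp_port(host, port, timeout=2.0):
    """Attempt a full TCP connection and classify what happened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"         # something accepted the connection
    except ConnectionRefusedError:
        return "closed"           # host answered, but nothing is listening
    except OSError:
        return "no response"      # timed out or unreachable; possibly filtered


# Example use, against a machine we administer:
#   for port in (22, 80, 443):
#       print(port, probe_tcp_port("127.0.0.1", port))
```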

Port Scanning Details

The nmap command is a "network mapping" tool, which can do a variety of things including port scanning. It has been actively developed since 1997 and is the most widely used port scanner.

It can do a variety of different scans and reports the state of the port as it sees it:

Open: an application is accepting TCP connections or UDP packets.
Closed: the port is accessible but does not have an application actively listening on it.
Filtered: the port is unreachable due to packet filtering (e.g. firewalls, routers, host security) and may be open or closed.
Unfiltered: the port is accessible but Nmap can't tell if it's open or closed. This is only used by ACK scans.
Open|Filtered: the port is either open or filtered, but nmap can't say which. Some scans don't return a response if the port is open, but the query may also have been filtered. Used by UDP, IP protocol, FIN, NULL, and Xmas scans.
Closed|Filtered: the port is either closed or filtered, but nmap can't say which. Only used by IP ID Idle scans.
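When reviewing scan results, it can help to pull these states out of the report programmatically. The sketch below parses a port table laid out the way nmap's normal output presents it (PORT, STATE, SERVICE columns). The sample report is invented for illustration, and the parser is an assumption about that layout rather than an official interface.

```python
import re

# Matches lines such as "22/tcp   open   ssh".
PORT_LINE = re.compile(r"^(\d+)/(tcp|udp|sctp)\s+(\S+)\s+(\S+)")


def parse_port_table(report):
    """Return (port, protocol, state, service) tuples found in a report."""
    rows = []
    for line in report.splitlines():
        match = PORT_LINE.match(line.strip())
        if match:
            port, proto, state, service = match.groups()
            rows.append((int(port), proto, state, service))
    return rows


# Invented sample, for illustration only.
SAMPLE = """\
PORT     STATE         SERVICE
22/tcp   open          ssh
80/tcp   closed        http
443/tcp  filtered      https
53/udp   open|filtered domain
"""

for row in parse_port_table(SAMPLE):
    print(row)
```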

TCP Handshakes

When first connecting to a server using TCP, a client initiates the connection with a "3-way handshake":

The client sends a SYN packet to the server
The server responds with a SYN-ACK packet.
The client then sends back an ACK packet completing the handshake.

To close a connection, there is a closing handshake as well. It involves four packets, which are usually described in three steps:

The client sends a FIN packet to the server.
The server sends an ACK packet, followed by a FIN packet back to the client.
The client sends an ACK packet to the server.

TCP also has reset (RST) packets, which indicate that the connection is closed with no handshaking.
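SYN, ACK, FIN and RST are not separate packet types but single-bit flags in the TCP header, and a packet can carry several at once. The sketch below uses the standard bit values for these flags to decode a flags byte; the helper name is hypothetical, and the opening handshake is shown in simplified form.

```python
# Standard bit positions of the TCP flags discussed here.
FIN, SYN, RST, PSH, ACK, URG = 0x01, 0x02, 0x04, 0x08, 0x10, 0x20

FLAG_NAMES = [(SYN, "SYN"), (ACK, "ACK"), (FIN, "FIN"),
              (RST, "RST"), (PSH, "PSH"), (URG, "URG")]


def flag_names(flags):
    """List the names of the flags set in a TCP flags byte."""
    return [name for bit, name in FLAG_NAMES if flags & bit]


# The opening handshake, as the flags carried by each packet.
OPENING_HANDSHAKE = [SYN, SYN | ACK, ACK]

for step, flags in enumerate(OPENING_HANDSHAKE, start=1):
    print(step, "-".join(flag_names(flags)))
```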

Scan Types

The default scan is a SYN scan. It involves sending a SYN packet to the host, and waiting to see if we receive the SYN-ACK packet back. If we do, then the port is open. In this case, we will send a RST packet instead of an ACK. This closes the connection before the handshake finishes. Because of this, the communication may not be included in application logs as a connection (like HTTP or SSH logs) because the communication never reaches the application layer.

If we receive a RST packet, that means the host is reachable but the port is not open. If we receive no response, then the port is reported as filtered.
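The decision logic of a SYN scan can be written down as a small function. This sketch only covers the interpretation step: it assumes some other code has already sent the SYN and captured the flags byte of the reply (or None if nothing came back). The function name is made up for this example.

```python
from typing import Optional

# Standard TCP flag bits used below.
SYN, RST, ACK = 0x02, 0x04, 0x10


def classify_syn_reply(reply_flags: Optional[int]) -> str:
    """Interpret the reply to a SYN probe; None means no reply arrived."""
    if reply_flags is None:
        return "filtered"   # nothing came back
    if reply_flags & RST:
        return "closed"     # host reachable, nothing listening
    if reply_flags & SYN and reply_flags & ACK:
        return "open"       # the server is willing to complete the handshake
    return "unknown"        # anything else is unexpected


print(classify_syn_reply(SYN | ACK))
print(classify_syn_reply(RST | ACK))
print(classify_syn_reply(None))
```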

This scan must be done with root access since it involves manipulating raw packets.

The Connect scan is similar to the SYN scan, except it completes the 3-way handshake with the host. Because of that, it is more likely to be included in application logs, and also takes longer since there's more communication. However, it uses a normal TCP connection and so does not need to be run as root. Also, the traffic looks more normal, so it's less likely to be blocked by firewalls. (The premature RST packets sent by SYN scans can be ignored in some cases.)

An ACK scan sends an acknowledgement packet to a host, which it won't be expecting. A system following TCP correctly should send back a RST packet whether the port is open or closed, so this scan can't tell open ports from closed ones. Instead it shows whether a port is filtered: a RST means the port is unfiltered, while no response suggests the packet was dropped along the way. In practice hosts can do whatever they want in these cases.

Flag scans are TCP packets sent with specific flags set in the packet header. They all will be unexpected for a host to receive. TCP says hosts should send back a RST packet for closed ports and not answer for open ports. In practice one or more of these may yield different behavior on different systems. There are different ones nmap supports with different specific flags set:
1. Null: no flags set at all
2. FIN: just the FIN flag set
3. Xmas: the FIN, PSH, and URG flags set. So called because the packet is "lit up like an Xmas tree".

All of the above scans involve TCP packets and make sense for scanning for TCP applications. We can also scan for UDP applications, but it's a little more scattershot because UDP does not have any connection initiation sequence like TCP does. A UDP scan sends an empty UDP packet to the server. The application can choose to do whatever it wants to with this. Some applications will respond, telling us the port is open. Some will simply ignore it, which nmap will list as "open|filtered" because it has no way to know whether the packet was ignored or filtered. A closed port usually causes the host to send back an ICMP "port unreachable" error, which tells us the port is closed.
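Each of the scan types described here corresponds to an nmap command-line option. The sketch below builds the command line from a scan name; the dictionary keys and the helper name are made up for this example, while the options themselves (-sS, -sT, -sA, -sN, -sF, -sX, -sU, -p) are nmap's own.

```python
# Scan type (our own labels) -> nmap option.
SCAN_FLAGS = {
    "syn": "-sS",      # SYN scan (needs root)
    "connect": "-sT",  # full TCP connect scan
    "ack": "-sA",      # ACK scan
    "null": "-sN",     # no flags set
    "fin": "-sF",      # FIN flag only
    "xmas": "-sX",     # FIN, PSH and URG set
    "udp": "-sU",      # UDP scan
}


def build_nmap_command(scan, target, ports="1-1024"):
    """Return the argument list for one scan of the given port range."""
    return ["nmap", SCAN_FLAGS[scan], "-p", ports, target]


print(" ".join(build_nmap_command("connect", "127.0.0.1")))

# To actually run it (nmap must be installed, and the target must be ours):
#   import subprocess
#   subprocess.run(build_nmap_command("connect", "127.0.0.1"), check=True)
```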

nmap also has the capability to send UDP packets for specific applications such as DNS and DHCP. These are fake but legit looking query packets which will be more likely to elicit a response from the server, instead of being ignored.
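A bare-bones UDP probe can be sketched with a normal datagram socket. By default it sends an empty payload, but a protocol-specific payload can be passed in instead. The function name and labels are our own, IPv4 is assumed, and the "closed" case relies on the operating system reporting the ICMP error back to a connected UDP socket, which may vary between platforms.

```python
import socket


def probe_udp_port(host, port, payload=b"", timeout=2.0):
    """Send one UDP datagram and classify the outcome."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        # Connecting a UDP socket sends nothing; it lets the OS hand
        # ICMP "port unreachable" errors back to us as exceptions.
        sock.connect((host, port))
        try:
            sock.send(payload)
            sock.recv(4096)
            return "open"            # the application answered
        except ConnectionRefusedError:
            return "closed"          # ICMP port unreachable came back
        except socket.timeout:
            return "open|filtered"   # silence: ignored or filtered


# Example use, against a machine we administer:
#   print(probe_udp_port("127.0.0.1", 53))
```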

Host Discovery

The nmap tool can also be used to find IP addresses which are reachable on the network. To do this, we don't want to connect on specific ports as we do with port scanning, because a host might be reachable, just not on the port we tried. There are a variety of ways this can be done, but usually a range of IP addresses is tested, and the ones which are reachable are listed out.

On a local network, nmap will use ARP requests. These work at the network access layer and so can't be filtered with a firewall.

For machines not on our LAN, we can send ICMP requests. The ICMP protocol (Internet Control Message Protocol) is an auxiliary protocol to IP. It includes an "echo" request, which asks a host to just send a response back. This is used to implement the ping command and also by nmap in order to see if hosts are reachable. Such a scan is called a "ping scan".

ICMP also includes a "timestamp" message, which is used to do time synchronization between hosts. nmap can also do a scan using timestamp messages instead.
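To make the idea of testing a range concrete, the sketch below expands a CIDR block into its candidate host addresses and builds an nmap host-discovery command for it. The helper names and probe labels are our own; the options used (-sn to skip the port scan, -PE for ICMP echo, -PP for ICMP timestamp, -PR for ARP) are nmap's. The address range is only an example.

```python
import ipaddress

# Probe type (our own labels) -> nmap host discovery option.
DISCOVERY_PROBES = {"echo": "-PE", "timestamp": "-PP", "arp": "-PR"}


def candidate_hosts(cidr):
    """List the usable host addresses in a network such as 192.168.1.0/24."""
    network = ipaddress.ip_network(cidr, strict=False)
    return [str(address) for address in network.hosts()]


def build_discovery_command(cidr, probe="echo"):
    """Return the argument list for a host discovery run with no port scan."""
    return ["nmap", "-sn", DISCOVERY_PROBES[probe], cidr]


print(candidate_hosts("192.168.1.0/30"))
print(" ".join(build_discovery_command("192.168.1.0/24", "timestamp")))
```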

OS Detection

We can make educated guesses of which operating system a host is running based on the way it behaves on a network. We do this by sending it packets and observing how it responds. The network layers are implemented in the operating system, and have different behaviors in some cases.

For example, IP packets have a "time to live" (TTL) value which gives the number of hops a packet can take before it is declared lost and discarded. Each router along the way decreases it by one. Linux typically starts packets at a TTL of 64, while Windows typically starts at 128. Unless the administrator has changed this value (which is unlikely) we can use it to infer what kind of host we are dealing with.
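The TTL heuristic can be sketched as below. Since the TTL we observe has already been decreased once per hop, we round up to the nearest common starting value. This is only a rough guess: it assumes the two defaults mentioned above, assumes the host is fewer than 64 hops away, and the function name is hypothetical.

```python
# Common initial TTL values from the text, smallest first.
COMMON_INITIAL_TTLS = [(64, "Linux/Unix-like"), (128, "Windows")]


def guess_os_from_ttl(observed_ttl):
    """Return (guess, estimated hop count) for a TTL seen in a reply."""
    for initial, label in COMMON_INITIAL_TTLS:
        if observed_ttl <= initial:
            return label, initial - observed_ttl
    return "unknown", None  # larger than any default we know about here


print(guess_os_from_ttl(57))
print(guess_os_from_ttl(117))
```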

TCP also specifies that a "window size" must be communicated between hosts. The window size is the amount of data which can be sent before an ACK is required. Different operating systems use different default values and scale the value differently.

nmap has the capacity to guess at host OS identity based on these and many other factors.

Best Practices

You should only ever scan machines that you personally administer.
You can use port scanning to verify what ports you have open, and close those which are not being used.