In this article I am going to briefly cover a microsegmentation approach using NSX-T and vRealize Log Insight. Be aware that this is just a basic microsegmentation example, meant to demonstrate how Log Insight can help you build an infrastructure-related rule base. Designing environment- or application-related rules is a whole different topic and will not be covered here.
The Bill of Materials (BOM) consists of NSX-T 3.1.1 and vRealize Log Insight 8.3.
vRealize Log Insight (vRLI)
vRealize Log Insight is a log collector and analytics tool that helps you preserve your logs and gain better visibility into what is going on in your environment. In this case we are going to use it to monitor specific firewall rules that are meant to capture all packets that do not match any other firewall rule.
By the way, Log Insight comes bundled with NSX-T under the same licence, so you have no excuse for not using it :)
NSX-T Distributed Firewall (DFW)
On the other side is NSX-T with its distributed firewall.
The DFW works by having a specific function in the IOChain intercept the VM traffic and send it to a module in the ESXi kernel, which in turn enforces the distributed firewall rules. As a result of that implementation, you get a firewall rule set applied on every single vNIC that is connected to an NSX-prepared virtual switch.
Being a stateful firewall, the DFW collects related packets until the connection state can be determined, and then it first checks the connection tracker (conntrack) table for a matching session. If such a session is found, the traffic is allowed to proceed. However, if there is no matching session, the flow is evaluated against the rule set on a first-match basis. This means that, reading the rule set for a virtual interface from top to bottom, the first rule that matches is the one used by the firewall. If the matching rule allows the traffic, the firewall puts the session flow in the conntrack table, where it remains until the session timer expires or the session is terminated.
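The evaluation order described above can be sketched in a few lines of Python. This is only an illustrative model, not the actual DFW implementation: the Flow and Rule types, the rule names and the addresses are all hypothetical, None stands for ANY, and session timers are not modelled.

```python
from typing import NamedTuple, Optional


class Flow(NamedTuple):
    src: str
    dst: str
    port: int


class Rule(NamedTuple):
    name: str
    src: Optional[str]   # None stands for ANY
    dst: Optional[str]
    port: Optional[int]
    action: str          # "ALLOW" or "DROP"


def matches(rule, flow):
    """A rule field matches when it is ANY (None) or equal to the flow value."""
    return (rule.src in (None, flow.src)
            and rule.dst in (None, flow.dst)
            and rule.port in (None, flow.port))


def evaluate(flow, rules, conntrack):
    """Return (action, reason) for a flow."""
    # 1. An existing session short-circuits the rule evaluation.
    if flow in conntrack:
        return "ALLOW", "conntrack"
    # 2. Otherwise read the rule set top to bottom; the first match wins.
    for rule in rules:
        if matches(rule, flow):
            if rule.action == "ALLOW":
                conntrack.add(flow)  # remember the allowed session
            return rule.action, rule.name
    raise ValueError("rule set must end with a default rule")


# Hypothetical rule set; the last entry plays the role of the Default rule.
rules = [
    Rule("allow-dns", None, "10.0.0.53", 53, "ALLOW"),
    Rule("block-telnet", None, None, 23, "DROP"),
    Rule("default", None, None, None, "ALLOW"),
]
conntrack = set()

dns = Flow("10.0.0.10", "10.0.0.53", 53)
first = evaluate(dns, rules, conntrack)    # matched by the "allow-dns" rule
second = evaluate(dns, rules, conntrack)   # served from the conntrack table
telnet = evaluate(Flow("10.0.0.10", "10.0.0.99", 23), rules, conntrack)
print(first, second, telnet)
```

The second lookup of the same flow never reaches the rule set, which is the behaviour that makes rule order matter only for the first packet of a session.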
Blacklisting vs Whitelisting
There are two different approaches to firewalling your environment: the blacklisting model and the whitelisting model.
The blacklisting model is when you create DENY rules to block specific types of traffic, and everything that does not match these DENY rules is allowed (Default Rule - ALLOW). The main advantage of the blacklisting model is its simplicity.
The whitelisting model is based on the zero trust principle, which essentially denies everything that is not explicitly allowed (Default Rule - DENY).
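The difference between the two models comes down to what happens to traffic that no explicit rule mentions. A minimal sketch, with hypothetical port-only rules chosen purely for illustration:

```python
def decide(port, explicit_rules, default_action):
    """Return the action of the first matching rule, else the default action."""
    for rule_port, action in explicit_rules:
        if rule_port == port:
            return action
    return default_action


blacklist = [(23, "DENY")]     # block Telnet, allow everything else
whitelist = [(443, "ALLOW")]   # allow HTTPS, deny everything else

# Port 8080 is not mentioned by either rule set.
blacklist_result = decide(8080, blacklist, default_action="ALLOW")
whitelist_result = decide(8080, whitelist, default_action="DENY")
print(blacklist_result, whitelist_result)
```

The same unlisted traffic is allowed under blacklisting and dropped under whitelisting, which is why a whitelisting rule base has to be complete before the Default rule is switched.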
According to VMware, microsegmentation is a network security technique that enables security architects to logically divide the data center into distinct security segments down to the individual workload level, and then define security controls and deliver services for each unique segment.
Here we are focusing on defining the security controls rather than on the network segmentation.
Enough theory, let's get to the actual work.
Using my home NSX-T lab, I have configured an Infrastructure section with very few rules that I am going to use as a starting point:
In the screenshot above, you might have noticed two unusual rules at the bottom of the Infra section. Their role is to catch all traffic that does not match any of the rules above them, and thus help me build the necessary rule base before I switch my Default rule to DENY and achieve zero trust.
The "Catchall-Outbound" rule has the "Infra-All" aggregation group as its source, which contains all Infra-related groups, i.e. all IP addresses of my infrastructure servers, and its destination is set to ANY. It is meant to capture the traffic that egresses from the Infra servers. That rule has a log label set to "Infra-Outbound":
The "Catchall-Inbound" rule has a similar configuration, where the only difference is the direction of the traffic: ANY to INFRA, so all the ingress traffic. It has an "Infra-Inbound" log label:
The catchall rules in my example focus on the Infrastructure section, but you can reuse the same approach for any firewall section.
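For readers who prefer to define such rules through the API rather than the UI, here is a rough sketch of what the two rule bodies might look like. It only builds and prints the JSON; nothing is sent to NSX. The field names (source_groups, destination_groups, services, action, logged, tag) are my assumption of the Policy API rule schema and should be verified against the API reference for your version. The group path is hypothetical.

```python
import json


def catchall_rule(name, source, destination, log_label):
    """Build a logged catch-all rule body; nothing is sent to NSX here."""
    return {
        "display_name": name,
        "source_groups": [source],
        "destination_groups": [destination],
        "services": ["ANY"],
        "action": "ALLOW",   # kept on Allow while the rule base is being built
        "logged": True,      # without logging there is nothing to analyse
        "tag": log_label,    # assumed to be the field behind the log label
    }


# Hypothetical policy path of the "Infra-All" aggregation group.
INFRA_ALL = "/infra/domains/default/groups/Infra-All"

outbound = catchall_rule("Catchall-Outbound", INFRA_ALL, "ANY", "Infra-Outbound")
inbound = catchall_rule("Catchall-Inbound", "ANY", INFRA_ALL, "Infra-Inbound")

print(json.dumps([outbound, inbound], indent=2))
```

The two bodies differ only in which side carries the Infra group and in the log label, which mirrors the UI configuration described above.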
Setting up a Dashboard in Log Insight
Assuming there is a preinstalled Log Insight instance that is already integrated with your vCenter and ESXi hosts and has the NSX-T content pack installed, the next step is to set up the NSX components to forward their logs to it.
That can be done manually, by configuring a syslog server in the CLI of each component (set logging-server), or globally, by going to System / Fabric / Profiles / All NSX Nodes.
Create Log Insight Dashboard
Now that vSphere and NSX-T are forwarding logs to the Log Insight instance, it is time to create dashboards to monitor the catchall rules.
Open the vRLI web interface and navigate to Interactive Analytics. Once there, search for one of the previously created log labels:
I am getting some results, which means there is some traffic that did not match any of the defined Infra rules and was therefore captured by my catchall rules.
To create a dashboard from that search, filter by Non-time series and group by vmx_nsxt_firewall_dst_ip_port (VMware - NSX-T):
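Conceptually, the grouping above is just a count of catchall hits per destination IP/port pair. The sketch below shows the same idea in Python on a handful of hypothetical, already-parsed events. The field names, the destination addresses (taken from the documentation ranges) and the counts are made up for illustration and are not real vRLI output.

```python
from collections import Counter

# Hypothetical, already-parsed catchall events (not real Log Insight data).
events = [
    {"label": "Infra-Outbound", "src": "172.16.35.100", "dst": "203.0.113.10", "dst_port": 80},
    {"label": "Infra-Outbound", "src": "172.16.35.101", "dst": "203.0.113.10", "dst_port": 80},
    {"label": "Infra-Outbound", "src": "172.16.35.100", "dst": "198.51.100.7", "dst_port": 123},
    {"label": "Infra-Inbound", "src": "198.51.100.7", "dst": "172.16.35.100", "dst_port": 22},
]


def top_destinations(events, label):
    """Count events carrying the given log label per 'dst_ip:dst_port' key."""
    counts = Counter(
        f"{e['dst']}:{e['dst_port']}" for e in events if e["label"] == label
    )
    return counts.most_common()  # highest count first


outbound = top_destinations(events, "Infra-Outbound")
for destination, hits in outbound:
    print(destination, hits)
```

The entry at the top of that list corresponds to the tallest bar in the dashboard, and it is the natural place to start when deciding which Allow rule to create next.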
After hitting Apply, I see some results in the graphic above, so the next step is to save that search to a dashboard.
Click the third icon from the right, Add current query to dashboard:
Then click Add to create your new dashboard:
Repeat the same procedure for all the log labels that you are monitoring. This is the result in my case:
Take a look at the dashboards above. You will see that the majority of the traffic that does not match any pre-created Infra rule is egress traffic. The ingress traffic is negligible.
To build up my Infra section rule base, I will start with the top polluter from the graphic above. There are 1363 packets that have been sent to IP 22.214.171.124 on port 80. That's a public IP and I am not quite sure what is behind it, so I do not know yet whether I need to create a matching Allow rule.
Click on the top polluter bar and select Interactive Analytics, which brings us to the analytics page filtered by that destination IP/port combination only:
On the analytics page, I can see two different sources, 172.16.35.100 and 172.16.35.101, which are actually test Linux VMs. A quick lookup of the destination IP (126.96.36.199) shows it is a repository for my Linux distro. That makes sense to me, as I did run an upgrade on my test VMs just to generate some traffic for the demo.
I definitely would like to keep updating my Linux machines, so I am going to create an Any to Linux Upgrade rule, where the destination will be the full list of official repositories. However, if there is traffic that you do not want to allow, there is no need to explicitly create a Deny rule. It will be dropped anyway once you get to the point where you feel comfortable with your rule base and actually switch the Default rule to Deny.
Keep monitoring the dashboards, examine the logged traffic and create Allow rules where required. Once you are happy with the results (i.e. the dashboards display only traffic that has to be blocked), you simply set these catchall rules to Deny. At a later point, when the rest of the environment is firewalled, switch the Default rule to Deny and remove the catchall rules.
Thanks for reading!