How to build a virtual pentest lab


Standalone virtual machines are both a cheaper and more practical way to test systems: they don’t need dedicated hardware and are easier to handle than physical installations (actions such as cloning, taking a snapshot or rolling back become trivial).

Network virtualization goes a step further and applies the same approach to a whole network, including workstations, servers, and all networking devices such as switches, routers and firewalls. A virtual network can be of any size and topology, and can mimic any real-life situation such as Active Directory domains or remote-access and site-to-site VPNs, or serve to test protocols of every network plane.

Such a virtual network can be either fully isolated or have one or several links to physical devices and networks; it’s all up to you to decide.

Hardware

Prerequisites

The goal of a virtual lab is to let you quickly set up an environment in which you can test whatever you would like to test.

If you have to use it on a regular basis, investing in a dedicated machine is, I think, a must. Indeed, the “I must shut down my browser to free memory to start my VM”, usually followed by “I must hibernate the VM to reopen my browser to do some search”, is really not sustainable in the long run.

This machine doesn’t need to be anything extravagant or expensive. However, there are some prerequisites that you will need to fulfill in order to have a comfortable, and therefore usable, virtual lab.

RAM: This is the first, main and only really mandatory criterion. If you want to start two virtual machines configured with 2 GB of RAM each, your hardware must have at least 6 GB of physical RAM (4 GB for the virtual machines, plus 2 GB for the host OS, don’t forget it).

If your hardware doesn’t have a sufficient amount of RAM, this will just not work. Yes, you may modify the virtual machines’ settings to lower their RAM requirement, but unless you really know what you are doing you will most likely end up facing unexpected and buggy behaviors.

If you want to have a comfortable virtual lab, 8 GB of RAM is the bare minimum, but I strongly recommend aiming for 16 GB. Note that a motherboard limits the amount of RAM it can handle; check it before buying anything.

Disk space: A virtual lab stores archived installation media, virtual machine disk images, plus their snapshots and backup copies. All this consumes a lot of space. Fortunately, not only is it easy to add or replace a hard disk (there is no motherboard limitation as there is for RAM), but above all, if your virtual lab gets low on disk space, nothing prevents you from using network storage to keep some or all of your data on an external device.

Note

Unlike RAM, on test systems such as the one we are building disk space is usually not allocated up front. This means that while a virtual machine configured with 2 GB of RAM will indeed allocate 2 GB of RAM, a virtual machine configured with 40 GB of disk space will only allocate the disk space it really uses. The 40 GB only acts as an upper bound that the virtual machine will not exceed; in most cases the space it actually uses will be far lower.
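
The same "declared size versus space really used" behavior can be observed with a plain sparse file. The sketch below is only an illustration of the principle, not how any particular hypervisor stores its disk images; it assumes a POSIX host (for `st_blocks`) and a filesystem supporting sparse files, and the `sparse_file_usage` helper name is mine.

```python
import os
import tempfile

def sparse_file_usage(size_bytes):
    """Create a sparse file of the given apparent size and return
    (apparent size, bytes actually allocated on disk)."""
    with tempfile.NamedTemporaryFile(delete=False) as f:
        path = f.name
        # Extending the file without writing any data leaves a "hole":
        # the size is recorded, but no data blocks are reserved for it.
        f.truncate(size_bytes)
    try:
        st = os.stat(path)
        # On POSIX systems st_blocks is counted in 512-byte units.
        return st.st_size, st.st_blocks * 512
    finally:
        os.remove(path)

apparent, allocated = sparse_file_usage(40 * 1024**2)  # 40 MiB upper bound
print(f"apparent: {apparent} bytes, allocated: {allocated} bytes")
```

On a filesystem without sparse file support, the two numbers would simply be equal.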

A few hundred GB of disk space is the bare minimum and a terabyte should be sufficient for most personal needs. There is no need for an SSD (storage speed is of little use here; invest your money in more RAM or storage space instead).

CPU: Obviously the CPU must support hardware-assisted virtualization instructions, but this is quite a common technology now (it appeared in 2006) and most CPUs have them (with the notable exception of low-consumption and RISC-based CPUs: don’t expect to build a good virtual lab on top of a Raspberry Pi!).

Apart from that, the CPU will only define the speed of and the amount of parallelism in the execution of your virtual machines: a quicker CPU with more cores allows more virtual machines to process data simultaneously.

For personal use, there is no need for anything really powerful as you rarely require true parallelism, and speed is something nice but not required here. This may change however if there are several people using the same virtualization hardware simultaneously, or if you are preparing for a CCIE certification and need to virtualize a large network with dozens of devices handling a constant flow of network packets in every direction simultaneously.
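
To make these prerequisites concrete, here is a minimal sketch of the two checks that matter most: the RAM arithmetic from the example above, and the presence of the hardware virtualization CPU flags. It assumes a Linux host exposing `/proc/cpuinfo` with the usual `vmx` (Intel VT-x) and `svm` (AMD-V) flags; the function names and the 2 GB host reserve default are just illustrative choices.

```python
def required_ram_gb(vm_ram_gb, host_reserve_gb=2):
    """Physical RAM needed to run all the listed guests at the same time.

    host_reserve_gb is the share kept for the host OS (2 GB in the
    example above; adjust it to your own host)."""
    return sum(vm_ram_gb) + host_reserve_gb

def has_hw_virtualization(cpuinfo_text):
    """True if a /proc/cpuinfo-style text advertises vmx or svm."""
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags"):
            _, _, value = line.partition(":")
            flags = value.split()
            if "vmx" in flags or "svm" in flags:
                return True
    return False

# Two guests of 2 GB each, plus 2 GB for the host: 6
print(required_ram_gb([2, 2]))

try:
    with open("/proc/cpuinfo") as f:
        print("hardware virtualization:", has_hw_virtualization(f.read()))
except OSError:
    print("no /proc/cpuinfo here (not a Linux host?)")
```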

The Wolf’s choice

Personally, I went for a second-hand Mac mini.

A Mac mini has all you may want from a personal virtualization system in a very small and quiet enclosure:

- Supports up to 16 GB of RAM (and I warmly recommend installing this amount).
- An SSD… to feed your old laptop. This is the only negative point of this machine (this and the fact that it has only a single network interface), as you will need to go to iFixit to find the required guides and tools to disassemble it and change the hard disk (screw you, Apple!). But on the other hand, the Mac mini supports up to two 2.5” hard drives (even the standard edition, see the related iFixit pages; don’t spend any money on the so-called “server” version!). Mine has a 1 TB drive.
- Intel Core i5 or i7 CPUs. Those coming with an i7 CPU are noticeably more expensive than the i5 ones, with no real benefit for personal use. Personally I took one with an i5 CPU.
- Last but not least, being a genuine Mac device, it lets you technically and legally run Apple’s operating systems in your lab.

For moderately higher needs (CCIE or small teams for instance), I saw several blog posts from people who were using a cluster of two Core i7 Mac minis connected to a NAS box and were very happy with the result.

Software

Operating system

I researched and tried several solutions, I think it may be useful to share my feedback here about each one of them.

VMware ESXi

ESXi is “free” (as in free beer, see Proxmox below) software built on a customized minimal system (VMware’s own VMkernel rather than Linux, despite the Unix-like shell). It embeds only the drivers matching the supported systems… and the Mac mini is not one of them, although some other Apple systems are supported in order to run their operating system.

The main consequence is that, at the time I was testing ESXi in my lab (ESXi 5), the network adapter driver was missing. You normally had to go through complex manual steps to install it, but William Lam shared modified ISO files downloadable from his blog. Once installed, ESXi worked fine on the Mac mini (even if I was not very happy with the idea of using installation media downloaded from a blog instead of official sources).

Going back to his blog as I write this article, it seems that things have evolved in the right direction since then and that ESXi 6 now embeds the missing drivers. This doesn’t make the Mac mini a supported platform, but it solves the update and upgrade issues plaguing the previous version of ESXi on this system, as long as these drivers remain available (not supported means that VMware will most likely not invest any specific effort to solve an issue related to these drivers and may remove them without prior notice if they cause any trouble).

But beyond the update issues, the real limitation of ESXi comes from its closed-source nature. While you can tinker with it (as shown by the driver issue and the modified ISO file), things quickly become unnecessarily complicated and poorly documented. As with any closed-source software, you are supposed to use the product a certain way and you depend on the vendor for everything.

All in all, someone like me quickly finds such a system way too cramped to be comfortable.

Alpine Linux

Alpine Linux is a security-oriented Linux distribution targeting embedded systems. Both its lightness and security properties make it a good choice for a “small, simple and secure” (as the project presents itself) shell to administer a virtual machine server.

However, when I tried it, UEFI was not natively supported by this system: the installation process went fine, but there was no way to start the system from the hard disk once installed.

When I say “no way” I’m probably lying a bit, since there were already some resources and ongoing work in this area, but it did not smell good at the time and looked more like some kind of rabbit hole than a reliable, step-by-step solution.

Nevertheless, I like the ESXi concept of a minimalistic system used as a backend to manage and monitor virtual machines, and I still believe that a system such as Alpine Linux would be the ideal fit to do the same thing, but better. The fact that it was not ready at that time does not mean that I will not come back to it in the future.

Debian

Well… it just works, what else is there to say? Insert the installation disc, install, reboot, and apt-get your favorite software!

This is what I’m using now.

Proxmox

Proxmox is a Debian derivative (with Ubuntu’s kernel). It is the free (as in free speech and free beer) alternative to ESXi, so if you need some kind of drop-in replacement, there you have it.

In my case I do not need a web interface, am satisfied with a command-line interface, and have a few other requirements (like using GNS3 or emulating legacy systems) that would make Proxmox features overkill and unnecessary.

Host virtualization software

Xen

Xen is a bare-metal hypervisor, meaning it interacts directly with the hardware without going through an intermediary operating system (even the management domain, Dom0, is technically just a guest system with more privileges).

While there was already some work ongoing to enable UEFI support in Xen, on my side I was not able to set up a reliable system:

- By default Grub starts Xen through multiboot. At this stage Xen no longer has access to some EFI parameters, resulting in the detection of only one processor core and a malfunctioning ACPI systematically freezing any virtual machine (both DomU and Dom0) when attempting to shut it down or restart it.
- In theory Xen should be able to start through Grub’s chainloader; in practice I was not able to make it start this way.
- I was however able to start Xen directly from the EFI using the EFI bootloader: all CPU cores were now detected, but the ACPI issue was still present.

KVM

KVM stands for Kernel-based Virtual Machine and, as the name implies, while Xen is a bare-metal hypervisor this one relies on a full-fledged Linux kernel.

The major advantage of this is that KVM doesn’t need to handle direct interaction with the hardware: all this is left to the Linux kernel, which in turn can rely on standard modules (drivers).

While trying to get Xen running on certain platforms can quickly become cumbersome (similarly to the ESXi we saw earlier, which like Xen is a micro-kernel based bare-metal hypervisor), wherever Linux can run, KVM can run too.
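
In practice, "KVM can run" shows up on the host as a `/dev/kvm` character device created by the kernel module. A minimal sketch of that check follows; it assumes a Linux host, and the `kvm_available` helper name is mine.

```python
import os
import stat

def kvm_available(device="/dev/kvm"):
    """True if the KVM character device exists and the current user
    can open it for reading and writing."""
    try:
        mode = os.stat(device).st_mode
    except OSError:
        # Module not loaded, virtualization disabled in firmware, or not Linux.
        return False
    return stat.S_ISCHR(mode) and os.access(device, os.R_OK | os.W_OK)

print("KVM usable:", kvm_available())
```

If the device exists but the check still fails, the usual suspect is permissions (on Debian the device typically belongs to the `kvm` group).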

Xen vs. KVM

Performance-wise, a 2011 study shows that, KVM being implemented low enough in the kernel stack, it offers similar if not better performance than Xen as a hardware virtualization software (i.e. emulating the computer hardware in order to start any kind of operating system).

Where Xen shines is in its initial and core functionality: paravirtualization (run a modified operating system which communicates directly with the virtualization software instead of communicating with emulated computer devices).

- If you want to build an EC2 Cloud with tons of Linux instances in the most efficient way (or, in broader terms, your environment will be generally composed of paravirtualized VMs and virtualized hardware will be the exception), then Xen may be your best fit.
- On the contrary, if hardware virtualization will be the rule and paravirtualization (supported by KVM through the virtio system, but slower than Xen) the exception, then KVM will be both an easier and a more efficient solution.
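
On the Qemu/KVM side, the difference between the two approaches is visible directly on the command line: the guest either sees emulated legacy devices or virtio ones. The sketch below only builds the argument list; the image name, the RAM amount and the `qemu_command` helper are illustrative assumptions, and the choice of `ide` and `e1000` as the emulated devices is mine.

```python
import shlex

def qemu_command(disk_image, ram_mb=2048, paravirtual=True):
    """Build a qemu-system-x86_64 command line for a KVM guest.

    paravirtual=True exposes virtio disk and network devices (the guest
    needs virtio drivers); False falls back to emulated IDE and e1000."""
    bus = "virtio" if paravirtual else "ide"
    nic = "virtio-net-pci" if paravirtual else "e1000"
    return [
        "qemu-system-x86_64",
        "-enable-kvm",                    # use hardware-assisted virtualization
        "-m", str(ram_mb),                # guest RAM in MB
        "-drive", f"file={disk_image},if={bus}",
        "-netdev", "user,id=net0",        # simple user-mode (NAT) networking
        "-device", f"{nic},netdev=net0",
    ]

print(shlex.join(qemu_command("guest.qcow2")))
print(shlex.join(qemu_command("guest.qcow2", paravirtual=False)))
```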

VMware Player

For now, keeping a VMware Player at hand remains handy. I know this is closed-source and closed-source is evil and all, but there are some situations where having it available might save your day:

- It is still a de-facto standard to share and run virtual machines. While you can still convert a VMware Player virtual machine into another format, sometimes you want to reduce Murphy’s Law to the minimum and cannot afford to lose any time (some closed-source applications even specifically check that they are running in a VMware virtual machine and not another VM to enforce their supported platform conditions).

  On these occasions, if a provider gives you a virtual machine tested on VMware Player with a strong recommendation to use the same software on your side, it is better not to be an extremist and just do it this way. When you have more time later, you can check the provider’s support forum and read the issues encountered by other people who tried the hard way.
- VMware still has a noticeable lead over Qemu/KVM on some necessary features, such as some pass-through functionalities allowing a guest to directly use some of the host’s physical devices, and related things.

  A typical example is connecting a physical USB key to a guest system: the feature exists in Qemu but I found no way to get it working; I suspect it is broken. On the other hand this is a trivial operation on VMware Player.
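
For the cases where converting a VMware disk image is acceptable, the usual route on a Qemu/KVM host is `qemu-img convert`. A minimal sketch, where the file names and the `vmdk_to_qcow2_command` helper are hypothetical:

```python
import shlex

def vmdk_to_qcow2_command(vmdk_path, qcow2_path):
    """Build the qemu-img call converting a VMware disk image to qcow2."""
    return [
        "qemu-img", "convert",
        "-f", "vmdk",     # input format
        "-O", "qcow2",    # output format
        vmdk_path, qcow2_path,
    ]

cmd = vmdk_to_qcow2_command("appliance-disk1.vmdk", "appliance-disk1.qcow2")
print(shlex.join(cmd))
# To actually run it: subprocess.run(cmd, check=True)
```

This only converts the disk; the virtual hardware description (the `.vmx` file) is not translated and has to be recreated by hand on the Qemu side.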

Some other potential areas of concern are:
- The support of 3D hardware acceleration in the guest (there is some work ongoing on the Qemu side about this; be sure to use the GTK display interface to benefit from it, but I don’t know how advanced it is compared to VMware Player).
- Nested virtualization, which lets the guest itself benefit from hardware-assisted virtualization. VMware has supported it for some time now (even though I’m not sure whether they put it in their free Player too or reserved it for their paid Workstation software); it is not yet supported by Qemu, but IIRC I saw a mention of this feature being added in some recent Linux kernel change, so it is coming.
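
On kernels that do ship nested virtualization, the KVM modules expose it as a `nested` module parameter under `/sys/module`. The sketch below reads it; it assumes a Linux host with `kvm_intel` or `kvm_amd` loaded, and the helper name is mine.

```python
from pathlib import Path

def nested_virtualization_enabled(sys_module_dir="/sys/module"):
    """Return True/False according to the KVM module 'nested' parameter,
    or None if neither kvm_intel nor kvm_amd exposes it."""
    for module in ("kvm_intel", "kvm_amd"):
        param = Path(sys_module_dir) / module / "parameters" / "nested"
        try:
            value = param.read_text().strip()
        except OSError:
            continue  # module not loaded or parameter missing
        # Depending on the module and kernel, the value reads Y/N or 1/0.
        return value in ("Y", "y", "1")
    return None

print("nested virtualization:", nested_virtualization_enabled())
```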

Oracle VM VirtualBox

This software is not completely free (some parts are available free of charge for evaluation and personal use only), is not really a standard, and has no special or unique feature compared to its alternatives.

Is there any reason to use it? Tell me because I don’t see any.

Network virtualization software

Once you have made up your mind about host virtualization software, time to think bigger and to emulate a whole network in your small box.

Here too, several software packages are available, each with its advantages and disadvantages. As long as they offer interconnection with external networks, it should be possible to make different packages interoperate (each one sees the other as an external network). In some cases, it may make sense to combine features which can only be found in two different network virtualization applications.

Note

Some network certification trainings provide a “network simulator” software.

The functionality of such software is usually very limited, to the point that you often cannot even edit the network topology but are restricted to the topologies proposed by the software author, with the more interesting ones available in paid add-ons.

VIRL

VIRL is Cisco’s proprietary, paid network virtualization software. It relies on ported virtual versions of the Cisco devices. To say it again: this software does not emulate Cisco devices; instead, the devices’ code has been ported to run natively on the host as VIRL modules. So, instead of running a real IOS and ASA for instance, it will run IOSv and ASAv. This provides far better performance, but the virtual devices may react differently or offer different options than real gear.

As it is paid software and I did not need its features, I did not try it and therefore cannot say a lot about it.

GNS3

GNS3 is the free alternative to Cisco’s VIRL. Initially a graphical frontend to and continuation of Dynamips, a free Cisco hardware emulator able to execute IOS firmware images, it has now evolved into a de-facto standard for network simulation in the open-source community, with support for appliances from a large number of free and commercial providers and a marketplace offering both software and learning material.

Note

The fact that GNS3 is free software and supports the emulation of proprietary appliances and devices does not make these solutions free.

Support means that they have been tested to work and that if you encounter an issue you can raise a ticket with the GNS3 development team. If you want to actually include such devices and appliances in your topology, you will need to provide the path to either a firmware image or an installation medium not provided with GNS3 or its appliances.

While a technical tool, GNS3 is graphical, which makes it easy enough to learn, while offering advanced features (including support for clusters to spread large network topologies over several computers) that also make it suitable for large projects.

Having spent some time on this tool, I will soon write a set of articles describing in detail how to use it to build virtualized networks focused on IT security testing.

Unetlab / EVE-NG

Unetlab (Unified Networking Lab), which has been recently refactored as EVE-NG (Emulated Virtual Environment - Next Generation) by its creator, can be seen as a GNS3 alternative targeting team-working.

At its core it provides similar functionalities to GNS3: network virtualization involving Cisco devices and hardware-virtualized hosts. Nevertheless, instead of providing a standalone graphical window, here everything can be done through a shared web interface (no need for a heavy client), in a multiuser environment, with various sharing and export/import features.

For a single user, this seems overkill to me, but for team-working this is certainly something I would try.

Hynesim

Hynesim is a network simulator specifically designed for IT security training. It was initially developed for the DGA (Direction Générale de l’Armement, the French Government Defense procurement and technology agency) and seems to target team-working with some ACL features.

While the source code is released under the GNU licence, it is not completely “free” in spirit, as the downloadable source code is more than a year old now and a few versions behind the version available to paying customers, and access to the complete documentation is also restricted to paying customers. Moreover, most of the website being written in French, it may not be very accessible to non-French speakers.

Despite its release and documentation issues, this project may still be interesting due to its primary focus on security testing. A quick glance over the little available documentation and the source code seems to indicate that it supports WiFi (but I don’t know what exactly is supported here) and Dynamips (emulation of Cisco devices). An old paper from 2008 announcing the project creation also mentioned Bluetooth, but this seems to have been abandoned.

The project team is also developing complementary utilities such as Action Manager, an automation tool apparently designed to simulate end-user activity within a guest (quoting the roadmap page: “mail related, web browsing related, text writer related…”), but it is not released yet.

Mininet-WiFi

Mininet-WiFi is a fork of Mininet adding WiFi capability to the latter.

Mininet is a Linux network virtualization software. It takes advantage of Linux kernel features to emulate potentially large Linux networks while staying very low on resources and offering better performance. The documentation mentions running hundreds of guests and switches on a single host and 2 Gbps total bandwidth on modest hardware.

However, this is OS-level virtualization, meaning all nodes will actually be Linux systems and all will share the same host’s kernel. Nevertheless, GNS3 published an article describing how to interoperate Mininet and GNS3 to get the best of both worlds.
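
Mininet is driven through a Python API, which gives a good feel for how lightweight such a lab is. The sketch below describes a small linear topology as plain data, then hands it to Mininet; the topology itself and the helper names are my own illustration, and actually running it assumes Mininet is installed and the script is started as root.

```python
def linear_topology(n_hosts):
    """Describe n hosts, each attached to its own switch, with the
    switches chained in a line. Returns (hosts, switches, links)."""
    hosts = [f"h{i}" for i in range(1, n_hosts + 1)]
    switches = [f"s{i}" for i in range(1, n_hosts + 1)]
    links = list(zip(hosts, switches)) + list(zip(switches, switches[1:]))
    return hosts, switches, links

def run_in_mininet(n_hosts=3):
    """Build the topology in Mininet and ping between every host pair."""
    # Imported lazily: only needed when the lab is really started.
    from mininet.net import Mininet

    hosts, switches, links = linear_topology(n_hosts)
    net = Mininet()
    net.addController("c0")
    nodes = {name: net.addHost(name) for name in hosts}
    nodes.update({name: net.addSwitch(name) for name in switches})
    for a, b in links:
        net.addLink(nodes[a], nodes[b])
    net.start()
    try:
        net.pingAll()
    finally:
        net.stop()  # always tear down the virtual interfaces

print(linear_topology(3))
```

Calling `run_in_mininet()` from a root shell starts the three hosts and three switches as processes and virtual interfaces on the host's own kernel, which is precisely the OS-level virtualization described above.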

Finally, Mininet-WiFi comes on top of this and, thanks to a virtual WiFi driver, adds WiFi network simulation capability.

The features announced in its documentation seem pretty promising:

- It takes into account distance from the access point, signal attributes, overlapping, interference and node movement.
- It handles authentication, from WEP to WPA2 and including RADIUS (all this using the standard Linux stack, the virtualization part being handled by the driver).
- It seems that the virtual device also supports monitor mode.

All this should be enough to reproduce most WiFi-related security scenarios (as long as all involved devices are or can be simulated with Linux systems).
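
To give an idea of what "taking distance into account" means for a simulated station, here is a sketch of the classic log-distance path loss model. This is a generic textbook model, not necessarily the one Mininet-WiFi applies by default, and every default value below (transmit power, channel frequency, path loss exponent) is an assumption of mine.

```python
import math

def free_space_loss_db(distance_m, freq_hz):
    """Free-space path loss (Friis formula) in dB."""
    c = 299_792_458.0  # speed of light in m/s
    return 20 * math.log10(4 * math.pi * distance_m * freq_hz / c)

def rssi_dbm(distance_m, tx_power_dbm=20.0, freq_hz=2.412e9,
             exponent=3.0, ref_m=1.0):
    """Received signal strength under the log-distance model:
    free-space loss up to ref_m, then 10 * exponent dB per decade."""
    loss = (free_space_loss_db(ref_m, freq_hz)
            + 10 * exponent * math.log10(distance_m / ref_m))
    return tx_power_dbm - loss

for d in (1, 10, 50):
    print(f"{d:>3} m: {rssi_dbm(d):.1f} dBm")
```

With an exponent of 3, each tenfold increase in distance costs 30 dB, which is the kind of degradation a simulated station experiences as it moves away from its access point.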

I will certainly dig deeper into this one as soon as I have the occasion.

Others

Brian Linkletter maintains a good list of open-source network simulation software.

In addition to the most well-known and general-purpose projects I already mentioned here, you will find others which are either less well known or address very specific use cases (for instance the Shadow project makes it possible to simulate large Internet-scale peer-to-peer infrastructures such as the Tor or Bitcoin networks).

WhiteWinterWolf Practical IT security, *nix systems & networking