A (Re)-Introduction to the Terraform vSphere Provider

Chris Marchesi

Terraform

Dec 11, 2017

NOTE: This article was originally released shortly after the 1.0 GA release of the Terraform vSphere provider in December of 2017. To ensure the usefulness of the information in the article, it has been updated to reflect the state of the provider as of version 1.3, which was released later in January 2018, and includes important updates to disk management within virtual machine resources. Future updates to the provider will be covered in new articles.

Since the release of HashiCorp Terraform 0.10, HashiCorp has been working hard to improve the features in our VMware vSphere provider for Terraform. Terraform enables organizations to use a consistent approach to provision any infrastructure, and VMware is a key component of many organizations' on-premises and private cloud infrastructure. We have made a significant effort to modernize the provider, adding resources to manage not just virtual machines, but also data center-level resources and inventory.

At the start of December, we reached a major milestone for the vSphere provider with the release of version 1.0, formalizing many of the major improvements that work to support vSphere's essential networking, storage, and inventory management features. This included a complete rewrite of the vsphere_virtual_machine resource, overcoming many long-standing design hurdles and adding many new features.

We quickly followed up with version 1.1, which corrects several post-release bugs in the vsphere_virtual_machine resource and exposes more data via the vsphere_virtual_machine data source to assist with cloning virtual machines from existing virtual machines or templates.

We included many of the release details in the mailing list release announcement, but we wanted to use this blog to demonstrate the power of the provider as it now stands to both new and old users alike. Whether you have just discovered the Terraform vSphere provider, have been using it for some time, or have been waiting for improvements to satisfy a certain use case, we hope you find the improvements helpful!

»Running the Full Example

In this article we will be covering an example for creating virtual machines on a brand new datastore, distributed virtual switch, and port group -- all managed through Terraform. Our example is a three-node cluster, and we will be creating one virtual machine for each host. Each VM will be customized with a working network configuration and should be available on the network once everything is complete. The example is moderately complex, and a little much to fit into the post in its entirety, so we will be referencing the vSphere provider repository itself. To follow along, get the full configuration here.

Follow the directions in the README in that directory for the settings for the configuration, and configure your endpoint and credentials by either adding the relevant options to provider.tf, or by setting them as environment variables. See here for a reference on provider-level configuration values.
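As a rough sketch of what the provider configuration in provider.tf might look like, assuming the credentials are passed in through variables (the variable names vsphere_user, vsphere_password, and vsphere_server are placeholders chosen for this illustration and are not taken from the example repository):

```hcl
# Hypothetical input variables; the names are placeholders for this sketch.
variable "vsphere_user" {}
variable "vsphere_password" {}
variable "vsphere_server" {}

provider "vsphere" {
  user           = "${var.vsphere_user}"
  password       = "${var.vsphere_password}"
  vsphere_server = "${var.vsphere_server}"

  # Only enable this if your vCenter endpoint uses a self-signed certificate.
  allow_unverified_ssl = true
}
```

If you would rather keep credentials out of the configuration entirely, the same settings can be supplied through the VSPHERE_USER, VSPHERE_PASSWORD, and VSPHERE_SERVER environment variables instead.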

Once you are ready, you can check your configuration by running terraform plan, and then terraform apply (or just the latter if you are on Terraform 0.11). Once you have approved the apply, Terraform will create your datastore, distributed virtual switch, port group, and finally, your three virtual machines, each on its own separate host. You should even be able to connect to each machine as well, based on the IP addresses you set.

»Unpacking the Example

Now that we've seen it in action, let's review each of the major parts of the configuration. We'll start with the data sources first before we move on to resources.

»Data Sources

Here are our data sources, pulled from the data_sources.tf file. For brevity, annotations have been removed:

data "vsphere_datacenter" "example_datacenter" {
  name = "${var.datacenter}"
}

data "vsphere_host" "example_hosts" {
  count         = "${length(var.esxi_hosts)}"
  name          = "${var.esxi_hosts[count.index]}"
  datacenter_id = "${data.vsphere_datacenter.example_datacenter.id}"
}

data "vsphere_resource_pool" "example_resource_pool" {
  name          = "${var.resource_pool}"
  datacenter_id = "${data.vsphere_datacenter.example_datacenter.id}"
}

data "vsphere_virtual_machine" "example_template" {
  name          = "${var.template_name}"
  datacenter_id = "${data.vsphere_datacenter.example_datacenter.id}"
}

Our use of data sources here is running under the notion that you likely already have a data center, ESXi hosts, and a cluster or resource pool under which you want to deploy all of the respective resources. Likewise, you also likely have a template that you want to use as the base for creating the VMs.

As such, rather than managing these resources within Terraform when there is no need to, we use the vsphere_datacenter, vsphere_host, vsphere_resource_pool, and vsphere_virtual_machine data sources to pull the necessary information from vSphere to use as inputs to our resources. To find out more about each data source, see its documentation in the Terraform vSphere provider reference.
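To make the inputs to these data sources concrete, here is a sketch of the matching entries in a terraform.tfvars file. The variable names come from the configuration above; every value is a hypothetical placeholder that you would replace with names from your own environment:

```hcl
# terraform.tfvars (all values are hypothetical placeholders)
datacenter = "dc1"

# One entry per ESXi host. The length of this list also drives the
# count on the vsphere_host data source.
esxi_hosts = [
  "esxi1.vsphere.local",
  "esxi2.vsphere.local",
  "esxi3.vsphere.local",
]

# Assumed here to be the root resource pool of a cluster named "cluster1".
resource_pool = "cluster1/Resources"

template_name = "linux-template"
```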

Several other data sources exist, including ones for datastores, distributed virtual switches, and networks (the latter allows you to look up standard port groups and opaque networks managed by NSX, in addition to DVS networks). This gives you a lot of flexibility in determining the scope of the Terraform configuration you wish to have. To see the full list, check the resource documentation page. Also note the count in the vsphere_host data source. This allows us to consolidate the configuration for all three host data sources into the same declaration, saving some repetition.
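As an illustration of that flexibility, the following sketch looks up a network and a distributed virtual switch that already exist, rather than creating them. The names "VM Network" and "existing-dvs" are placeholders assumed for this example:

```hcl
# Look up an existing network (for example, a standard port group) by name.
data "vsphere_network" "existing_network" {
  name          = "VM Network" # placeholder name
  datacenter_id = "${data.vsphere_datacenter.example_datacenter.id}"
}

# Look up an existing distributed virtual switch by name.
data "vsphere_distributed_virtual_switch" "existing_dvs" {
  name          = "existing-dvs" # placeholder name
  datacenter_id = "${data.vsphere_datacenter.example_datacenter.id}"
}
```

The id attribute of either data source can then be used wherever the example configuration references a resource that Terraform creates.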

»Resources

Now let's go over our resources. We will review this in additional detail so that we can explain what each resource does and what it's capable of, especially in the case of the virtual machine resource.

»The NAS datastore resource

We use the vsphere_nas_datastore resource to create an NFS datastore to store our virtual machines in:

resource "vsphere_nas_datastore" "example_datastore" {
  name            = "${var.datastore_name}"
  host_system_ids = ["${data.vsphere_host.example_hosts.*.id}"]

  type         = "NFS"
  remote_hosts = ["${var.nas_host}"]
  remote_path  = "${var.nas_path}"
}

The above example creates an NFS datastore with the name defined in datastore_name. It connects using NFS v3 to the host that we define in the nas_host variable, and the share path defined in nas_path. These two variables would each be counterparts to an NFS path such as nfs1.vsphere.local:/export/nfsds1. We also pass in the output of our multi-count vsphere_host data source, mounting this datastore on all three of our hosts. With the storage complete, we can move on to networking resources.
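For the NFS path mentioned above, the two variables would be split as follows. This is a sketch of the relevant terraform.tfvars entries; the datastore name is a placeholder:

```hcl
# terraform.tfvars entries for nfs1.vsphere.local:/export/nfsds1
datastore_name = "nfsds1" # placeholder name for the new datastore
nas_host       = "nfs1.vsphere.local"
nas_path       = "/export/nfsds1"
```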

»The Distributed Virtual Switch resource

Next, we use the vsphere_distributed_virtual_switch resource to create the distributed virtual switch (DVS). This is the bridge between the virtual network and the physical hardware, and is attached to network interfaces on each host:

resource "vsphere_distributed_virtual_switch" "example_dvs" {
  name          = "${var.switch_name}"
  datacenter_id = "${data.vsphere_datacenter.example_datacenter.id}"

  host {
    host_system_id = "${data.vsphere_host.example_hosts.0.id}"
    devices        = ["${var.network_interfaces}"]
  }

  host {
    host_system_id = "${data.vsphere_host.example_hosts.1.id}"
    devices        = ["${var.network_interfaces}"]
  }

  host {
    host_system_id = "${data.vsphere_host.example_hosts.2.id}"
    devices        = ["${var.network_interfaces}"]
  }
}

The above example creates a DVS with the name defined in switch_name, in the data center from our vsphere_datacenter data source. We add each of our hosts individually this time around, as each host/NIC combination needs to be declared in its own separate host sub-resource. Note that we are using a list of common NICs in each instance (defined by the network_interfaces variable), so in this example, each host has to have the exact same NIC combination free for this DVS.

We are not quite done setting up our network: we still need to add a port group to this DVS.
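A declaration for the shared NIC list might look like the sketch below. The device names vmnic1 and vmnic2 are hypothetical defaults; use whichever physical NICs are unused on every one of your hosts:

```hcl
# The same NIC names must be free on every host that joins the DVS.
variable "network_interfaces" {
  type    = "list"
  default = ["vmnic1", "vmnic2"] # hypothetical device names
}
```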

»The DVS Port Group resource

To add the port group and complete the network configuration, we use the vsphere_distributed_port_group resource. This resource is invaluable; while you might not need to add a datastore and port group for every application, business unit, or customer, it's quite common to have a VLAN or a port group as a logical unit of separation.

Let's look at our example below:

resource "vsphere_distributed_port_group" "example_port_group" {
  name                            = "${var.port_group_name}"
  distributed_virtual_switch_uuid = "${vsphere_distributed_virtual_switch.example_dvs.id}"

  vlan_id = "${var.port_group_vlan}"
}

Here, we are creating a port group with the name defined in port_group_name. This also becomes our network name (if you ever decide to look it up with the vsphere_network data source). We pass in the UUID of the DVS that's created from our vsphere_distributed_virtual_switch resource, and finally assign a VLAN ID (or just create an untagged network, if the VLAN ID is 0). The configuration here is nice and short. Nothing else is necessary unless you require more advanced settings or overrides.
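As a sketch of the untagged case mentioned above, a second port group on the same DVS could be declared like this. The resource label and port group name are placeholders invented for the illustration:

```hcl
# A second port group on the same DVS, carrying untagged traffic.
resource "vsphere_distributed_port_group" "example_untagged_port_group" {
  name                            = "example-untagged" # placeholder name
  distributed_virtual_switch_uuid = "${vsphere_distributed_virtual_switch.example_dvs.id}"

  # A VLAN ID of 0 creates an untagged network.
  vlan_id = 0
}
```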

We are finally ready to create our virtual machines.

»The Virtual Machine resource

Finally, here is the vsphere_virtual_machine resource:

resource "vsphere_virtual_machine" "example_virtual_machines" {
  count            = "${length(var.esxi_hosts)}"
  name             = "${var.virtual_machine_name_prefix}${count.index}"
  resource_pool_id = "${data.vsphere_resource_pool.example_resource_pool.id}"
  host_system_id   = "${data.vsphere_host.example_hosts.*.id[count.index]}"
  datastore_id     = "${vsphere_nas_datastore.example_datastore.id}"

  num_cpus = 2
  memory   = 1024
  guest_id = "${data.vsphere_virtual_machine.example_template.guest_id}"

  network_interface {
    network_id   = "${vsphere_distributed_port_group.example_port_group.id}"
    adapter_type = "${data.vsphere_virtual_machine.example_template.network_interface_types[0]}"
  }

  disk {
    label = "disk0"
    size  = "${data.vsphere_virtual_machine.example_template.disks.0.size}"
  }

  clone {
    template_uuid = "${data.vsphere_virtual_machine.example_template.id}"

    customize {
      linux_options {
        host_name = "${var.virtual_machine_name_prefix}${count.index}"
        domain    = "${var.virtual_machine_domain}"
      }

      network_interface {
        ipv4_address = "${cidrhost(var.virtual_machine_network_address, var.virtual_machine_ip_address_start + count.index)}"
        ipv4_netmask = "${element(split("/", var.virtual_machine_network_address), 1)}"
      }

      ipv4_gateway    = "${var.virtual_machine_gateway}"
      dns_suffix_list = ["${var.virtual_machine_domain}"]
      dns_server_list = ["${var.virtual_machine_dns_servers}"]
    }
  }
}

There is a lot here to digest, so we will break it down a bit:

Basic options

First off, let's start with the basic options. We are creating three virtual machines, assuming that the length of the esxi_hosts variable is 3. These VMs are named based on a base name, defined in the virtual_machine_name_prefix variable, with the resource index (accessed through count.index) appended to the end. We locate this in the resource pool fetched by our vsphere_resource_pool data source, on the host specified by the vsphere_host data source at the relevant count index, and place it in the datastore that we created with our vsphere_nas_datastore resource. You can also see the basic CPU and RAM options specified. Each VM will get 2 vCPUs and 1 GB of RAM (defined by the num_cpus and memory options specifically).

Virtual device options

The vsphere_virtual_machine resource allows management of three major virtual device types: network interfaces, defined by network_interface sub-resources, virtual disks, defined by disk sub-resources, and an optional CD-ROM drive, specified by a cdrom sub-resource. The latter is not defined here as our VM does not require a CD-ROM drive.
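If your VM did need a CD-ROM drive, a cdrom sub-resource backed by an ISO on a datastore might look like the following sketch. The datastore name and ISO path are placeholders assumed for this illustration:

```hcl
# Hypothetical datastore that holds ISO images.
data "vsphere_datastore" "iso_datastore" {
  name          = "iso-datastore" # placeholder name
  datacenter_id = "${data.vsphere_datacenter.example_datacenter.id}"
}

resource "vsphere_virtual_machine" "example_virtual_machines" {
  # ... other arguments as in the full example ...

  cdrom {
    datastore_id = "${data.vsphere_datastore.iso_datastore.id}"
    path         = "ISOs/os-livecd.iso" # placeholder path on that datastore
  }
}
```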

A network_interface block is straightforward. Devices are created in the order in which they are declared. All that is actually necessary is the network_id attribute, which we load from the id attribute of our created vsphere_distributed_port_group resource. In addition to this, we also load the interface adapter type from the template loaded by the vsphere_virtual_machine data source, to ensure that the new interface's virtual hardware is consistent with the template's.

The disk sub-resources are a bit more complex, owing to their more sensitive nature. In order to track the lifecycle of each disk, we require that a label be assigned to each virtual disk. The label acts as a unique name for each device, much like the name of a fully-fledged resource. It should follow a simple convention like disk0, disk1, and so on (in fact, this convention is what is expected when importing a virtual machine resource). We also take the exact size from the template's data source, to ensure that it is consistent with the source disk.

One option that is not shown in the disk sub-resource above, but that you need as soon as a VM has more than one disk, is the unit_number of a specific disk. This is where the disk shows up on the SCSI bus, and it ultimately influences the device order and naming as the guest OS sees it. One disk must always be set to 0, which is the default (think of this as your root disk). If you were to declare a second disk, you would have to include a new, unique unit number:

disk {
  label       = "disk1"
  size        = 100
  unit_number = 1
}

Cloning and customization options

A very useful VM creation workflow is to clone a virtual machine from an existing virtual machine, or from a template, which is a pre-created virtual machine that has been marked as being used for such tasks. This has obvious benefits in saving time and configuration management, and as a pattern fits well with some of our other tools like Packer. Further to this, VMware offers methods to perform common post-clone configuration tasks, namely network configuration. This process is called customization.

While cloning is a popular method of VM creation, it is not the only method, and one of the more important features that was refactored during the rewrite of the vsphere_virtual_machine resource was how it handled cloning. The older resource organized cloning and customization options in a way that was tightly coupled to the baseline configuration, and customization had to be explicitly disabled via a flag named skip_customization.

In the new version of the resource, there is a much more explicit separation between creating a VM from scratch, and cloning from a template. All cloning options are contained within the clone sub-resource, and all customization options in a customize sub-resource within that. This effectively layers the features on top of each other. If you don't include clone, the VM will be created from scratch and not cloned, and if you don't include customize within that, your VM will be cloned, but it won't be customized.
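To illustrate the layering, here is a sketch of a VM that is cloned from the same template but not customized: it keeps the clone block and simply omits customize. The resource label and VM name are placeholders invented for this example:

```hcl
resource "vsphere_virtual_machine" "example_clone_only" {
  name             = "example-clone-only" # placeholder name
  resource_pool_id = "${data.vsphere_resource_pool.example_resource_pool.id}"
  datastore_id     = "${vsphere_nas_datastore.example_datastore.id}"

  num_cpus = 2
  memory   = 1024
  guest_id = "${data.vsphere_virtual_machine.example_template.guest_id}"

  network_interface {
    network_id = "${vsphere_distributed_port_group.example_port_group.id}"
  }

  disk {
    label = "disk0"
    size  = "${data.vsphere_virtual_machine.example_template.disks.0.size}"
  }

  # Cloned, but not customized: there is no customize block here.
  clone {
    template_uuid = "${data.vsphere_virtual_machine.example_template.id}"
  }
}
```

Removing the clone block as well would create the VM from scratch, in which case the disk is created empty at the size you specify.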

In our example, we clone from the template that is specified in our vsphere_virtual_machine data source. We use the disk size reported back as well, as the disk must be at least the same size when cloning.

Finally, we come to our customization options. These allow us to perform some post-clone configuration of a virtual machine and are guest OS dependent (which is why we set the guest type from the template, via the guest_id parameter at the top level of the resource). Here we set the host name using the same prefix and index that we use for the VM name. The domain is set using the virtual_machine_domain variable. Next, we assign IP addressing from the virtual_machine_network_address and virtual_machine_ip_address_start variables, using them as inputs to the cidrhost interpolation function. This calculates the IP address for each VM across the range of hosts, which would allow us to scale past three hosts if we wanted, provided there are enough IP addresses in the network. The netmask is taken from the prefix length of the same network address. Lastly, we specify the gateway and DNS server list via the virtual_machine_gateway and virtual_machine_dns_servers variables, and use the virtual_machine_domain variable again to specify the DNS search domains.
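To see how the address calculation works out, here is a small sketch that evaluates the same expressions with literal values. It assumes a network address of 10.0.0.0/24 and a starting host number of 100; both values, and the local names, are hypothetical:

```hcl
locals {
  # cidrhost(network, hostnum) returns the address at the given host
  # number within the network. The host number is
  # virtual_machine_ip_address_start + count.index.
  first_vm_ip  = "${cidrhost("10.0.0.0/24", 100 + 0)}" # "10.0.0.100"
  second_vm_ip = "${cidrhost("10.0.0.0/24", 100 + 1)}" # "10.0.0.101"
  third_vm_ip  = "${cidrhost("10.0.0.0/24", 100 + 2)}" # "10.0.0.102"

  # The netmask is the prefix length after the slash.
  netmask = "${element(split("/", "10.0.0.0/24"), 1)}" # "24"
}
```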

The combination of all of the steps above gives us a completely ready-to-go virtual machine once the cloning is complete. If you have the necessary access to do so, when Terraform is finished you should be able to SSH into the respective virtual machines without issue. This also helps facilitate any post-creation provisioners that need to run as well.
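To find the addresses to connect to without opening the vSphere client, one option is to add an output such as the sketch below. The output name is a placeholder, and the sketch assumes that each guest reports its address back to vSphere, for example through VMware Tools:

```hcl
# Lists the primary IP address reported by each of the VMs.
output "virtual_machine_ip_addresses" {
  value = ["${vsphere_virtual_machine.example_virtual_machines.*.default_ip_address}"]
}
```

After the apply completes, running terraform output prints the list.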

»New vMotion Features

Now that we have finished breaking down the example configuration, let's look at a feature that is completely new to the vSphere provider: support for vMotion.

In our sample configuration, you can see this in action by modifying the host_system_id attribute of our VM resource. Let's try changing this so that all of our servers are on the first host found by our vsphere_host data source. Change this line:

resource "vsphere_virtual_machine" "example_virtual_machines" {
  ...

  host_system_id = "${data.vsphere_host.example_hosts.*.id[count.index]}"
}

To:

resource "vsphere_virtual_machine" "example_virtual_machines" {
  ...

  host_system_id = "${data.vsphere_host.example_hosts.*.id[0]}"
}

(NOTE: data.vsphere_host.example_hosts.0.id works as well.)

After you do this, run another plan (or just apply if you are on 0.11 or higher), and you will see the plan mentioning that it will be migrating two of the virtual machines:

Terraform will perform the following actions:

  ~ vsphere_virtual_machine.example_virtual_machines[1]
      host_system_id: "host-2" => "host-1"

  ~ vsphere_virtual_machine.example_virtual_machines[2]
      host_system_id: "host-3" => "host-1"

Run the apply, and watch the VMs all migrate to one host. You can then roll back this change by restoring the line to its previous version, and running the apply again. This will put the VMs back on the hosts they were on before.

»Storage vMotion

As another exercise, let's try out storage vMotion. Add the following to data_sources.tf, substituting existingdatastore1 with the name of an existing datastore in your environment:

data "vsphere_datastore" "existing_datastore" {
  name          = "existingdatastore1"
  datacenter_id = "${data.vsphere_datacenter.example_datacenter.id}"
}

Now, in resources.tf, change the following:

resource "vsphere_virtual_machine" "example_virtual_machines" {
  ...

  datastore_id = "${vsphere_nas_datastore.example_datastore.id}"
}

To:

resource "vsphere_virtual_machine" "example_virtual_machines" {
  ...

  datastore_id = "${data.vsphere_datastore.existing_datastore.id}"
}

On your next plan/apply cycle, you will see the following changes:

Terraform will perform the following actions:

  ~ vsphere_virtual_machine.example_virtual_machines[0]
      datastore_id:        "datastore-2" => "datastore-1"
      disk.0.datastore_id: "datastore-2" => "datastore-1"

  ~ vsphere_virtual_machine.example_virtual_machines[1]
      datastore_id:        "datastore-2" => "datastore-1"
      disk.0.datastore_id: "datastore-2" => "datastore-1"

  ~ vsphere_virtual_machine.example_virtual_machines[2]
      datastore_id:        "datastore-2" => "datastore-1"
      disk.0.datastore_id: "datastore-2" => "datastore-1"

Once you approve and apply these changes, the migration of these virtual machines from your created datastore to the existing datastore will begin.

Again, you can revert these changes to restore the virtual machines back to their original datastores.

»Tearing it all down

Now that we are all done with the demonstration, let's remove the resources we have created. Run:

terraform destroy

You should see something like:

Terraform will perform the following actions:

  - vsphere_distributed_port_group.example_port_group

  - vsphere_distributed_virtual_switch.example_dvs

  - vsphere_nas_datastore.example_datastore

  - vsphere_virtual_machine.example_virtual_machines[0]

  - vsphere_virtual_machine.example_virtual_machines[1]

  - vsphere_virtual_machine.example_virtual_machines[2]

Confirm the destroy and Terraform will remove the VMs, datastore, port group, and DVS, freeing all of these resources.

»Coming Soon

Finally, we'd like to mention some of the plans for the immediate future of the vSphere provider:

We will be working on adding support for compute and cluster resources within the provider, such as compute clusters, standalone hosts, and datastore clusters.
Subsequently, DRS capabilities will be added to the VM resource, allowing for automation of VM migration guided by current resource usage.
OVF and vApp support will be coming to the VM resource as well, facilitating the ability to either import OVF templates, or clone from an existing imported OVF template with the correct configuration set.

»Conclusion

We hope you have found this demonstration of the current capabilities of the vSphere provider useful. For the full details on provider usage, be sure to check the provider documentation. For more information about how to submit bug reports, feature requests, or make your own contributions to the provider, see the vSphere provider project page.
