Typical error message follows:
```
Error: Unreadable module directory
Unable to evaluate directory symlink: lstat modules/backend: no such file or directory
```

Terraform cannot find the path to the backend in use. Create a symbolic link to the backend module directory inside the `modules` directory:

```
cd modules
ln -sfn ../backend_modules/<BACKEND> backend
```
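For example, a minimal sketch for the libvirt backend (the backend name is illustrative; use the directory matching your setup), followed by a quick check that the link resolves:

```
cd modules
ln -sfn ../backend_modules/libvirt backend   # 'libvirt' is an example backend
ls -l backend                                # should point to ../backend_modules/libvirt
```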
Typical error message follows:
```
Could not resolve hostname server.tf.local: Name or service not known
# or
connect: Invalid argument
```
Check that:
- your firewall is not blocking UDP port 5353
- on SUSE systems check YaST -> Security and Users -> Firewall -> Zones -> "public" and "libvirt": "mdns" should appear in the list on the right (allowed)
- Avahi is installed and running
  - typically you can check this via systemd: `systemctl status avahi-daemon`
  - be sure you don't have mixed versions: `rpm -qa | grep avahi`
- Avahi is capable of resolving the host you are trying to reach: `avahi-resolve -n <host>.tf.local`
- mdns is configured in glibc's Name Service Switch configuration file, `/etc/nsswitch.conf`. You should see a `hosts:` line that looks like the following:

  ```
  hosts: files mdns4 [NOTFOUND=return] dns
  ```

  `mdns4` should be present in this line. If it is not, add it.
Note: `mdns6` also exists to resolve IPv6 addresses, but it currently has one known issue where it may return incorrect addresses (specifically, link-local addresses starting with `fe80:` but without a zone identifier suffix such as `%eth0`). We recommend IPv4 for the time being.
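To quickly verify the resulting configuration, you can query NSS directly; `getent` honors `/etc/nsswitch.conf`, so a successful lookup confirms the `mdns4` entry is active (the host name below is illustrative):

```
grep '^hosts:' /etc/nsswitch.conf
getent hosts server.tf.local
```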
Starting with `nss-mdns` version 0.14.1, you also need to populate `/etc/mdns.allow` with:

```
.local
.tf.local
```

`mdns.allow` is required to force all multicast DNS domains to be resolved regardless of label count or unicast SOA records.
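A minimal sketch to create that file with the two entries above (adjust the domain list to your environment):

```
printf '.local\n.tf.local\n' | sudo tee /etc/mdns.allow
```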
If there is a 5-second delay on any name resolution (or ping) between Avahi hosts, a likely cause is that IPv6 is enabled on the VMs (the default setting) but the network is blocking IPv6 traffic.
This can happen, for example, in libvirt if the NAT network in use does not have IPv6 enabled. Re-creating the network with IPv6 enabled fixes the problem.
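As a sketch, IPv6 can be enabled on an existing libvirt NAT network by adding an IPv6 `<ip>` element to its XML definition and re-creating the network (the network name `default` and the ULA address are examples):

```
virsh net-edit default
# add inside the <network> element, for example:
#   <ip family="ipv6" address="fd00:dead:beef::1" prefix="64"/>
virsh net-destroy default
virsh net-start default
```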
Typical error message follows:
```
libvirt_domain.domain: Error creating libvirt domain: [Code-1] [Domain-10]
internal error: process exited while connecting to monitor:
2016-10-14T06:49:07.518689Z qemu-system-x86_64: -drive file=/var/lib/libvirt/images/mirror-main-disk,format=qcow2,if=none,id=drive-virtio-disk0:
Could not open '/var/lib/libvirt/images/mirror-main-disk':
Permission denied

# Another possible one is:
libvirt_domain.domain: Error creating libvirt domain: virError(Code=1, Domain=0, Message='internal error: child reported: Kernel does not provide mount namespace: Permission denied')

# Yet another one is:
Failed to connect socket to '/var/run/libvirt/libvirt-sock': No such file or directory
```

Some variants of this issue will also be logged in `/var/log/syslog`, like:

```
Oct 14 08:10:03 dell kernel: [52456.461754] audit: type=1400 audit(1476425403.666:27):
apparmor="DENIED" operation="open" profile="libvirt-4d4f9b58-f50d-4f60-a8a1-315c9a2d02c1"
name="/var/lib/libvirt/images/terraform_mirror_main_disk"
pid=4193 comm="qemu-system-x86" requested_mask="wr" denied_mask="wr"
fsuid=106 ouid=106
```
There are two possible causes: plain Unix permissions or AppArmor.
Check that the directory in question (in the case above, `/var/lib/libvirt/images`) is writable by the user running `qemu` (you can find that user via `ps -eo uname:20,comm | grep qemu-system-x86` when you have at least one virtual machine running).

If the directory has wrong permissions:
- fix them manually via `chmod`/`chown`;
- check that the pool definition has the right user and permissions via `virsh pool-edit <POOL_NAME>` (note that the mode is specified in octal and user/group as numeric IDs, see `id -u <USERNAME>` and `cat /etc/group`).
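For example, a hedged sketch of the manual fix (ownership varies by distribution: the qemu user is often `qemu` on SUSE and Red Hat systems and `libvirt-qemu` on Debian and Ubuntu; 0711 is libvirt's usual default mode for the images directory):

```
sudo chown qemu:qemu /var/lib/libvirt/images
sudo chmod 0711 /var/lib/libvirt/images
```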
If the user running `qemu` is not the one you expected:
- change the `user` and `group` variables in `/etc/libvirt/qemu.conf`; typically `root`/`root` will solve any problem;
- restart the libvirt daemon.
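A minimal sketch of that change (these are standard `qemu.conf` settings; running qemu as root is convenient but weakens isolation):

```
# in /etc/libvirt/qemu.conf:
#   user = "root"
#   group = "root"
sudo systemctl restart libvirtd
```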
AppArmor can be used both to isolate the host from its guests, and guests from one another. Both of those protections must be properly configured in order for sumaform to operate correctly. Unfortunately, this configuration can be difficult and is well beyond the scope of this guide. Following are instructions to disable AppArmor completely in non-security-sensitive environments.
- host-guest protection: add `security_driver = "none"` to `/etc/libvirt/qemu.conf`
- guest-guest protection:

```
sudo ln -s /etc/apparmor.d/usr.sbin.libvirtd /etc/apparmor.d/disable/
sudo /etc/init.d/apparmor restart
sudo aa-status | grep libvirt # output should be empty
```
If you run into this error during `terraform apply`:

```
dial tcp 192.168.122.193:22: getsockopt: network is unreachable
```

it means Terraform was able to create a VM, but is then unable to log in via SSH to configure it. This is typically caused by network misconfiguration: `ping 192.168.122.193` should work but does not. Please double-check your networking configuration. If you are using AWS, make sure your current IP is listed in the `ssh_allowed_ips` variable (or check that it is whitelisted in AWS Console -> VPC -> Security Groups).
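If you are unsure what your current public IP is, one quick way to check (the service URL is just one example):

```
curl -s https://checkip.amazonaws.com
```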
Run `sh /root/salt/highstate.sh`.
A: you can use Terraform's taint command to mark a resource to be re-created during the next `terraform apply`. To get the correct name of the module and resource, use `terraform state list`:

```
$ terraform state list
...
module.server.module.server.libvirt_volume.main_disk[0]

$ terraform taint module.server.module.server.libvirt_volume.main_disk[0]
Resource instance module.server.module.server.libvirt_volume.main_disk[0] has been marked as tainted.
```
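Note that on Terraform v0.15.2 and newer, the `-replace` option achieves the same in a single step (same resource address as above):

```
terraform apply -replace='module.server.module.server.libvirt_volume.main_disk[0]'
```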
A: see above, use the taint command as per the following example:

```
$ terraform state list
...
module.base.libvirt_volume.volumes[2]

$ terraform taint module.base.libvirt_volume.volumes[2]
Resource instance module.base.libvirt_volume.volumes[2] has been marked as tainted.
```

Please note that any dependent volume and module should be tainted as well before applying (e.g. if you are tainting the `sles12sp5` image, make sure you either have no VMs based on that OS or that they are all tainted).
Terraform cannot find your SSH key in the default path `~/.ssh/id_rsa.pub`. See Accessing VMs for details.
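If you have no key pair at all, a minimal sketch to generate one at that default path:

```
ssh-keygen -t rsa -b 4096 -f ~/.ssh/id_rsa
```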
If you have just switched from non-bridged to bridged networking or vice versa, you might get the following error:

```
Error applying plan:

1 error(s) occurred:

libvirt_domain.domain: diffs didn't match during apply. This is a bug with Terraform and should be reported as a GitHub Issue.
...
Mismatch reason: attribute mismatch: network_interface.0.bridge
```

This is a known issue: simply repeat the `terraform apply` command and it will go away.
If you run `terraform apply` from outside of the sumaform tree, you will get the following error message:

```
Error applying plan:

1 error(s) occurred:

stat salt: no such file or directory
```

A simple solution is to create, in your current directory, a symbolic link pointing to the `salt` directory at the top level of the sumaform files tree, as sketched below.
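A minimal sketch (the path to your sumaform checkout is illustrative):

```
ln -s /path/to/sumaform/salt salt
```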
If you want to work with more than one `main.tf` file, for example to use both a libvirt and an AWS configuration, you can follow the instructions in the README_ADVANCED.md file to set up multiple workspaces. To change to another workspace, just remove and re-create the corresponding links, and then execute `terraform init`.
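As a hedged sketch of such a switch, assuming the symlink-based layout described in README_ADVANCED.md (all file and directory names below are purely illustrative):

```
rm main.tf
ln -s ../workspaces/aws/main.tf main.tf
terraform init
```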
When changing between workspaces, it may happen that `terraform init` throws "is not a valid parameter" errors, as if the workspace had not actually changed. To resolve this, just remove the Terraform cache:

```
rm -r .terraform/
```
Sumaform uses JeOS and, by default, `zypper` is configured not to install docs. To change that behavior, edit `/etc/zypp/zypp.conf`, change the property to `rpm.install.excludedocs = no`, and re-install the package.
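A minimal sketch of those steps (`<PACKAGE>` is a placeholder; `zypper install --force` re-installs an already-installed package):

```
# in /etc/zypp/zypp.conf:
#   rpm.install.excludedocs = no
sudo zypper install --force <PACKAGE>
```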
Q: How do I work around a "doesn't match any of the checksums previously recorded in the dependency lock file" error?
This error can occur during a `terraform init` execution:
```
$ terraform init
Initializing modules...

Initializing the backend...

Initializing provider plugins...
- Reusing previous version of dmacvicar/libvirt from the dependency lock file
- Reusing previous version of hashicorp/null from the dependency lock file
- Reusing previous version of hashicorp/template from the dependency lock file
- Using previously-installed hashicorp/template v2.2.0
- Installing dmacvicar/libvirt v0.6.3...
- Using previously-installed hashicorp/null v3.1.0
╷
│ Error: Failed to install provider
│
│ Error while installing dmacvicar/libvirt v0.6.3: the local package for registry.terraform.io/dmacvicar/libvirt 0.6.3 doesn't match any of the checksums previously recorded in the dependency lock file (this might be because the available checksums are for packages
│ targeting different platforms)
```
Just delete the `.terraform.lock.hcl` file inside your sumaform folder and run `terraform init` again:

```
rm .terraform.lock.hcl
terraform init
```
See the Terraform docs for more information on the dependency lock file.