It has long annoyed me that all the scaling options in Azure just add and remove hosts. They never target the host itself. Hosts are either under or overutilized in 84% of the case.
And this is especially relevant for AVD personal hostpools where users each have their own personal “VDI”.
So I’m releasing a custom PowerShell module called “ADDRS” (Azure Data Driven Right Sizing) that grabs mem/cpu performance of the VM or all VM’s in a resource group you tell it to check. It will then do some smart voodoo magic to determine what size out of an allowlist best fits.
Instructions / Example:
Use -WhatIf if you don’t want it to resize the VM
Use -Force if you want to resize a VM even if it is online (which will cause it to be shut down!)
Use -Boot if you want the VM to be started after resizing (by default it will stay deallocated)
Use -domain with your domain if your VM is domain joined
Use -region if your region is not westeurope
Use -Verbose if you want the full output incl financial projection
Use -Report if you want to output data to csv. Can be used together with -WhatIf
Modify minMemoryGB, maxMemoryGB, minvCPUs, maxvCPUs as desired for your usecase
You can adjust the preconfigured $allowedVMTypes array to allow specific VM types
use -maintenanceWindowStartHour, -maintenanceWindowLengthInHours and –maintenanceWindowDay if you want to ignore performance data during a maintenance window (e.g. for patching) as that isn’t representative
Set an Azure Tag called LCRightSizeConfig with the value disabled on machines you want to ignore
Set an Azure Tag called LCRightSizeConfig with a machine type value (e.g. “Standard_D4ds_v4“) if you want to lock a specific size for that machine, this can be useful if you want the script to resize from current to target automatically when it runs while the VM has been deallocated.
Example -Verbose output of two VM’s being resized:
The module will check if there is sufficient data about the machine in Azure Monitor, if not, no action will be taken. You can determine how far back the function looks by modifying $measurePeriodHours
It is recommended to match job schedules to the lookback period, or at least not run multiple times in the same lookback period. Otherwise, the data that is being used for sizing may not be representative if the machine had already been resized in an earlier run.
Make sure you’ve got enough data in Log Analytics
Make sure the allowedVMTypes list contains only VM types that you can actually upgrade to. If e.g. your VM has an ephemeral disk, and your allowList has types that do not, the resize will fail with an error message (but no harm will be done to the existing VM)
I’ve only tested the maintenance window parameters using UTC time, if you’re using different timezones your results in excluding data generated during the maintenance window may vary from mine
To allow admins further customization of these settings, I’ve written a Proactive Remediation script that can customize any VPN profile property to any value you specify.
In our case, we used it to set IpInterfaceMetric, which defaults to 0, causing ambiguously routed traffic to never prefer the VPN connection (since this is a split tunnel connection). Setting it to 1 resolved our DNS/routing issues to certain private endpoints in our Azure environment.
Therefore, here’s another runbook you may run to just report on your inactive devices, or to automatically (and optionally periodically) clean up inactive devices in your environment when the removeInactiveDevices switch is supplied.
When run locally, interactive sign in is required. When running as a runbook in Azure automation, the Managed Identity of the automation account is leveraged. This requires you to set Device.ReadWrite.All or Device.Read.All permissions depending on if you want to script to do the cleanup as well. If doing cleanup, also add the managed identity to the cloud device administrator (Azure AD) role.
Autopilot / on premises devices
Note that the script will log an error (and not attempt to delete the device) when a device is an autopilot record (not a real device) or when the device is synced from an on-premises active directory.
Disable vs Delete
The runbook also has a disable option, in which it will first disable a device and wait a configurable ($disableDurationInDays) period of time before actually deleting a device.
Microsoft started using these properties in april 2020, so accounts active before that will seem like they have never been active.
This script supports running non-interactive as a runbook in Azure Automation if you supply the -nonInteractive switch. Before this will work, you’ll have to enable Managed Identity on your automation account and run a small script to assign graph permissions to the Managed Identity: AuditLog.Read.All and Organization.Read.All
As always this script is provided as-is and should be reviewed and then used at your own risk.