Removing special characters from UTF8 input for use in email addresses or login names

When working with non-US customers, you'll often encounter users with characters like ë, ó and ç in their names. Most of the time, a ‘human process’ converts these to their simple equivalents of e, o and c for use in computerized systems.

While searching for such a mapping of special characters to ‘safe’ characters, I had a hard time finding a good list or PowerShell method that automatically converts special characters to standard A-Z characters, so I wrote one:

function get-sanitizedUTF8Input{
    Param(
        [String]$inputString
    )
    #lookup table of special characters and their plain a-z equivalents
    $replaceTable = @{
        "ß"="ss";"à"="a";"á"="a";"â"="a";"ã"="a";"ä"="a";"å"="a";"æ"="ae"
        "ç"="c";"è"="e";"é"="e";"ê"="e";"ë"="e";"ì"="i";"í"="i";"î"="i";"ï"="i"
        "ð"="d";"ñ"="n";"ò"="o";"ó"="o";"ô"="o";"õ"="o";"ö"="o";"ø"="o"
        "ù"="u";"ú"="u";"û"="u";"ü"="u";"ý"="y";"þ"="p";"ÿ"="y"
    }

    #replace each special character with its mapped equivalent
    foreach($key in $replaceTable.Keys){
        $inputString = $inputString -replace $key,$replaceTable.$key
    }
    #strip anything that is still not a plain letter or digit
    $inputString = $inputString -replace '[^a-zA-Z0-9]', ''
    return $inputString
}

#example usage:
get-sanitizedUTF8Input -inputString "Jösè"
#result:
Jose

Edit: my colleague Gerbrand alerted me to a post by Grégory Schiro that solves this issue much more elegantly using native .NET functions. Below is my slightly modified version, which ensures nothing outside a-z, A-Z and 0-9 gets past the function:

function Remove-DiacriticsAndSpaces
{
    Param(
        [String]$inputString
    )
    #decompose each character into its base letter plus combining diacritical marks
    $objD = $inputString.Normalize([Text.NormalizationForm]::FormD)
    $sb = New-Object Text.StringBuilder

    #keep every character except the combining (non-spacing) marks
    for ($i = 0; $i -lt $objD.Length; $i++) {
        $c = [Globalization.CharUnicodeInfo]::GetUnicodeCategory($objD[$i])
        if($c -ne [Globalization.UnicodeCategory]::NonSpacingMark) {
            [void]$sb.Append($objD[$i])
        }
    }

    #recompose, then strip anything that still isn't a plain letter or digit
    $sb = $sb.ToString().Normalize([Text.NormalizationForm]::FormC)
    return($sb -replace '[^a-zA-Z0-9]', '')
}
#example usage:
Remove-DiacriticsAndSpaces -inputString "Jösè"
#result:
Jose

And an even easier one-liner by John Seerden, which I converted to a function:

function Remove-DiacriticsAndSpaces
{
    Param(
        [String]$inputString
    )
    #round-trip through a code page that lacks accented characters (Cyrillic),
    #so .NET 'best-fits' them to their plain ASCII equivalents
    $sb = [Text.Encoding]::ASCII.GetString([Text.Encoding]::GetEncoding("Cyrillic").GetBytes($inputString))

    #remove spaces and anything the conversion above may have missed
    return($sb -replace '[^a-zA-Z0-9]', '')
}

And the most advanced function I’ve found so far is by Daniele Catanesi (PsCustomObject): https://github.com/PsCustomObject/New-StringConversion/blob/master/New-StringConversion.ps1, which supports all the features of the functions above and makes them parameterized.

Reporting on global tenant storage usage and per-site storage usage

As my employer is a Microsoft Cloud Service Provider, we want to monitor the total storage available to, and used by, all of the tenants we manage under CSP, including storage used by SharePoint and Teams. This called for a script!

[screenshot: per-customer total storage usage overview]

I slimmed the resulting script down to work for a single tenant, so you can use it to generate an XLSX report showing which of your sites / teams are nearing their assigned storage quota. You can build your own alerting around this to raise site quotas before your users upload too much data, or use it to buy additional storage from Microsoft before your tenant reaches its maximum quota 🙂

[screenshot: per-site storage overview in Excel]

As usual, find it on Gitlab!
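
To give you a feel for the core of such a report, here's a minimal sketch using the SharePoint Online management shell. This is not the actual script (which also detects Teams and writes XLSX); the admin URL and output path are example values:

#requires the Microsoft.Online.SharePoint.PowerShell module and SharePoint admin rights
Import-Module Microsoft.Online.SharePoint.PowerShell -DisableNameChecking
Connect-SPOService -Url "https://yourtenant-admin.sharepoint.com"

#list every site with its current usage vs assigned quota (both in MB)
Get-SPOSite -Limit All | ForEach-Object {
    [PSCustomObject]@{
        Url            = $_.Url
        UsedMB         = $_.StorageUsageCurrent
        QuotaMB        = $_.StorageQuota
        PercentOfQuota = if($_.StorageQuota -gt 0){ [Math]::Round(($_.StorageUsageCurrent / $_.StorageQuota) * 100, 1) }else{ 0 }
    }
} | Sort-Object PercentOfQuota -Descending | Export-Csv -Path "siteStorageReport.csv" -NoTypeInformation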

Finding files in SharePoint Online or Teams that exceed the 218-character path limit

Update: new version of this script with GUI here 🙂

A well-known issue when migrating to Office 365 (SharePoint, Teams and OneDrive) is path length.

Recently, Microsoft increased the maximum path length in SharePoint Online from 256 to 400 characters (the total length of the URL). This causes issues when you use Office, because Office 2013, 2016 and 2019 do not support paths over 218 characters in length.*

To help you proactively identify files that exceed this limit, I wrote a PowerShell script that:

  • filters based on file type
  • automatically finds and processes all SharePoint sites in your tenant
  • automatically finds and processes all Team sites in your tenant
  • handles multi-factor authentication

Find get-filesWithLongPathsInOffice365.ps1 on GitLab

It leans heavily on the great work done by the community around OfficePnP; all credit to the community for providing so much quality code for free!
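
If you just want a feel for the core check the script performs, here's a minimal sketch using the newer PnP.PowerShell module (an assumption on my part; the actual script uses the OfficePnP libraries and covers all sites automatically). The site URL, library name and 218 threshold are example values:

Import-Module PnP.PowerShell
Connect-PnPOnline -Url "https://yourtenant.sharepoint.com/sites/example" -Interactive

$maxLength = 218
Get-PnPListItem -List "Documents" -PageSize 500 -Fields "FileRef" | ForEach-Object {
    #the full path as Office sees it: tenant host plus the server-relative URL
    $fullPath = "https://yourtenant.sharepoint.com" + $_["FileRef"]
    if($fullPath.Length -gt $maxLength){
        [PSCustomObject]@{ Path = $fullPath; Length = $fullPath.Length }
    }
}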

*longer paths may still work, this is not a hard limit

OnedriveMapper v3.18 released!

Version 3.18 of OneDriveMapper has been released

  • No longer forces PowerShell to use TLS 1.2 by default, but negotiates opportunistically (if 1.2 doesn’t work, it falls back to 1.1 or 1.0); see the sketch after this list
  • Automatically picks an available drive letter if you set the drive letter of a mapping to ‘autodetect’
  • Recursive group membership search (instead of just one level deep) when mapping based on groups
  • Better retries and error handling when looking up favorited sites
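
For those curious what opportunistic TLS looks like in practice: the usual approach is to offer all protocols at once and let .NET negotiate the strongest one the endpoint supports, roughly as below (a sketch of the common ServicePointManager pattern, not necessarily OneDriveMapper's exact code):

#offer TLS 1.2, 1.1 and 1.0 simultaneously instead of forcing 1.2 only,
#so the strongest protocol both sides support gets negotiated
[Net.ServicePointManager]::SecurityProtocol = [Net.SecurityProtocolType]::Tls12 -bor [Net.SecurityProtocolType]::Tls11 -bor [Net.SecurityProtocolType]::Tls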

Get the new version here

SAP SuccessFactors to Active Directory Sync (updated users)

For a customer that uses SuccessFactors to manage their employees and contractors, I wrote a script that updates the AD accounts of any person who is updated in SuccessFactors.

I expect you to have working knowledge of how to configure SF PerformanceManager to export the users you wish to update to a CSV file on the SFTP server SF provides for you.

With that in place, you should be able to configure the script. It basically maps any field you export to any field in Active Directory you wish. In some cases, such as the Manager field, special logic has been added to the script to look up the user’s manager. For other special fields you may have to write your own logic.
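
To illustrate the mapping idea, here's a simplified sketch, not the actual script; the CSV columns, the employeeID matching attribute and the file path are assumptions:

Import-Module ActiveDirectory

#map CSV column -> AD attribute; extend this table with your own export fields
$fieldMap = @{
    "FirstName" = "givenName"
    "LastName"  = "sn"
    "JobTitle"  = "title"
}

foreach($row in (Import-Csv -Path "sfExport.csv")){
    $adUser = Get-ADUser -Filter "employeeID -eq '$($row.EmployeeId)'"
    if(-not $adUser){ continue }

    $updates = @{}
    foreach($csvField in $fieldMap.Keys){
        if(-not [String]::IsNullOrEmpty($row.$csvField)){
            $updates[$fieldMap[$csvField]] = $row.$csvField
        }
    }
    #special case: resolve the manager's employee ID to a distinguishedName
    $manager = Get-ADUser -Filter "employeeID -eq '$($row.ManagerId)'"
    if($manager){ $updates["manager"] = $manager.DistinguishedName }

    if($updates.Count -gt 0){
        Set-ADUser -Identity $adUser -Replace $updates
    }
}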

If you wish, the script will also email you a full report of the changes it made.

Get it @ Gitlab directly:  https://gitlab.com/Lieben/assortedFunctions/tree/master/SuccessFactors%20and%20SAP
