This post is about what I thought was odd behaviour when calling the .NET String.Split method with multiple separator characters from PowerShell. I first came across it myself but didn’t pay much attention to it. Only after reading about it again on Tommy Maynard’s blog did I decide to find out more.
Let’s have a look at an example first:
[code language="powershell"]
#using String.Split with one separator character works as expected
'This is a test'.Split('e')
#using multiple characters, not so much
'c:\\test'.Split('\\')
'c:\\test'.Split('\\').Count
[/code]
The second example tries to split a string on a double backslash, but the result is an array of three strings instead of two. Let’s see why this is happening by retrieving the specific overload definition we are using:
[code language="powershell"]
#get the overload definition of the method we are using
''.Split.OverloadDefinitions[0]
#string[] Split(Params char[] separator)
[/code]
OK, it looks like this overload of the Split method expects a character array for the separator parameter. That is why we saw an additional split: every character of the string argument '\\' is treated as a separator in its own right, so the empty string between the two backslashes becomes an element of the result. Let’s see if String.Split has other overload definitions that accept a String as the separator argument:
[code language="powershell"]
''.Split.OverloadDefinitions | Select-String 'string[] separator' -SimpleMatch
<#
string[] Split(string[] separator, System.StringSplitOptions options)
string[] Split(string[] separator, int count, System.StringSplitOptions options)
#>
[/code]
Indeed, there are two overloads that accept a String array argument instead. Let’s use the first one. We don’t need the StringSplitOptions parameter in this case and can therefore use a value of ‘None’ for the argument.
[code language="powershell"]
#this doesn't work since we need a String array
'c:\\test'.Split('\\', 'None')
#finally we get only two parts back
'c:\\test'.Split(@('\\'), 'None')
'c:\\test'.Split(@('\\'), 'None').Count
[/code]
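To make the difference between the two overloads visible, here is a minimal sketch that wraps every element of the result in brackets, so the empty string between the two backslashes shows up. The explicit [char[]] and [string[]] casts and the variable names are my own additions to pin down which overload gets picked; they are not part of the examples above.
[code language="powershell"]
$path = 'c:\\test'

#explicit character array: each backslash is a separator of its own
$charParts = $path.Split([char[]]'\\')
#explicit string array: the two backslashes together form one separator
$stringParts = $path.Split([string[]]'\\', [System.StringSplitOptions]::None)

#wrap each element in brackets so that the empty string becomes visible
($charParts | ForEach-Object { "[$_]" }) -join ' '
#[c:] [] [test]
($stringParts | ForEach-Object { "[$_]" }) -join ' '
#[c:] [test]
[/code]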
We could have used the -split operator in the first place, but that would have been too easy, right ;-). Furthermore, with the String.Split method we can also split a string by multiple strings in just one go:
[code language="powershell"]
#with the -split operator every \ has to be escaped by doubling it, since the separator is treated as a regular expression
'c:\\test' -split '\\\\'
#splitting by two strings
'split by xx and yy in one go'.Split(('xx','yy'),'None')
#can be done also with -split using a scriptBlock
[/code]
In conclusion, PowerShell provides a lot of options when it comes to splitting strings. Looking only at the separator parameter, we have five options:
- Using String.Split’s first overload with a character array
- Using one of String.Split’s overloads that accept a string array
- Using the -split operator, which accepts a string for the separator parameter (the string is actually interpreted as a regular expression)
- Using the -split operator with a ScriptBlock that determines where to split. A lot can be done within the ScriptBlock: $_ represents the current character, $args[0] the entire string, and $args[1] the current position within the entire string
- Finally, there is also the .NET Regex.Split method, which offers even more options but is very similar to the -split operator
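As a rough side-by-side sketch, the five options could be applied to the same input like this. The variable names and the explicit casts are my own choices for illustration. Note that the ScriptBlock variant shown here tests one character at a time, so it behaves like the character array overload and returns three parts.
[code language="powershell"]
$path = 'c:\\test'

#1. character array overload: every backslash is a separator (3 parts)
$byChars = $path.Split([char[]]'\\')
#2. string array overload: the double backslash is one separator (2 parts)
$byStrings = $path.Split([string[]]'\\', [System.StringSplitOptions]::None)
#3. -split with a regular expression: each \ is escaped by doubling it (2 parts)
$byRegex = $path -split '\\\\'
#4. -split with a ScriptBlock: split wherever the current character is a backslash (3 parts)
$byScriptBlock = $path -split { $_ -eq '\' }
#5. Regex.Split with the same pattern as the -split operator (2 parts)
$byRegexClass = [regex]::Split($path, '\\\\')

$byChars.Count, $byStrings.Count, $byRegex.Count, $byScriptBlock.Count, $byRegexClass.Count
[/code]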