Wednesday, October 13, 2010

PowerShell GCI (Get-ChildItem) LiteralPath Issue

Test Music Directory for Powershell Scripts
After we finished ripping our CD collection to FLAC format, we wrote some PowerShell scripts to analyze the directory hierarchy that holds the FLAC (*.flac) files. The hierarchy looks like \\server\music\artist\album\track.flac, where “artist”, “album”, and “track” vary with the CD and track ripped. Starting at the root, \\server\music, the scripts generate the following stats:

1. Number of album folders that contain no music (not even one .flac file). This can indicate that something went wrong and needs to be checked.

2. Number of artist folders that have no album folders in them. For housekeeping reasons it’s not necessary to have empty folders hanging around.

3. Number of .flac files that are smaller than 500 KB. This can indicate a problem with the ripping process, although some legitimate .flac files are that small, such as hidden files or silent tracks.
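As a rough illustration of how stats 2 and 3 can be gathered, here is a minimal sketch. It is not the scripts we actually ran: the variable names are made up for this example, and a small throwaway tree under the temp directory stands in for \\server\music.

```powershell
# Build a small throwaway tree to stand in for \\server\music (illustration only).
$root = Join-Path ([System.IO.Path]::GetTempPath()) ("music-" + [guid]::NewGuid())
$album = Join-Path (Join-Path $root "artist1") "album1"
New-Item -ItemType Directory -Path $album | Out-Null
New-Item -ItemType Directory -Path (Join-Path $root "artist2") | Out-Null
Set-Content -Path (Join-Path $album "track.flac") -Value "tiny"

# Stat 2: artist folders that contain no album folders.
$emptyArtists = @(gci -LiteralPath $root |
    Where-Object { $_.PSIsContainer } |
    Where-Object { -not (gci -LiteralPath $_.FullName | Where-Object { $_.PSIsContainer }) })

# Stat 3: .flac files smaller than 500 KB.
$smallFlacs = @(gci -LiteralPath $root -Recurse |
    Where-Object { -not $_.PSIsContainer -and $_.Extension -eq ".flac" -and $_.Length -lt 500KB })

"Artist folders without albums: " + $emptyArtists.Count
"FLAC files under 500 KB: " + $smallFlacs.Count
```

Wrapping each pipeline in @( ) keeps the result an array even when only one item (or none) comes back, so .Count is always meaningful.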

In the course of ripping CDs, the following situation can occur for collections of two or more discs:

\\server\music\artistBlah\albumBlah [Disc 1]
\\server\music\artistBlah\albumBlah [Disc 2]

where the “[Disc 1]” and “[Disc 2]” are part of the folder name.

With the directory structure defined, what are the basics of the PowerShell scripts? They are .ps1 files run from a PowerShell command prompt. The key command in the scripts is the Get-ChildItem cmdlet, commonly typed as its alias gci, which works like a directory listing. It has several parameters that refine what is returned, such as only files with a certain extension (*.flac). The scripts first invoke gci (with appropriate parameters) at the root directory (\\server\music), capture the result in an array, and then iterate through that array, checking each entry against specific criteria. The problem we ran into is that some gci parameters do not work well when the folder name contains the brackets “[” and “]”.

To make the problem we encountered more concrete, let’s suppose we have the following directory structure:

..\artist1\album1 [disc 1]
file.flac
folder.jpg
..\artist1\album1 [disc 2]
file.flac
..\artist1\album2
file.flac

In a PowerShell command window if you invoke the following command:
gci -path ("\\server\music\artist1\album1 [disc 1]") -recurse -include *.flac

The result is that nothing is found, when we expect the command to return file.flac.

Similarly, if you invoke the following command:
gci -path ("\\server\music\artist1\album2") -recurse -include *.flac

The result returns file.flac as expected.

The problem with the first command is the brackets in the path (folder name). Brackets are perfectly legal in folder names, but with -path PowerShell interprets them as wildcard characters that define a set or range of characters to match, as discussed in this TechNet Tip. To stop PowerShell from interpreting the brackets this way, we can use the -literalpath parameter of the gci cmdlet.
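The wildcard behavior is easy to see with the -like operator, which uses the same wildcard syntax. The sketch below is only an illustration; the variable names are ours, and the Escape call is shown as one way to make a bracketed name literal, not as something the scripts in this post rely on.

```powershell
$folder = "album1 [disc 1]"

# As a wildcard pattern, "[disc 1]" matches ONE character from the set: d, i, s, c, space, 1.
$literalMatchesItself = $folder -like "album1 [disc 1]"      # False: the name does not match itself
$singleCharMatches    = "album1 d" -like "album1 [disc 1]"   # True: "d" is one character from the set

# Escaping the brackets turns them back into literal characters.
$escaped = [System.Management.Automation.WildcardPattern]::Escape($folder)
$escapedMatches = $folder -like $escaped                     # True

$literalMatchesItself
$singleCharMatches
$escapedMatches
```

This is why the first gci command finds nothing: it goes looking for a folder named “album1 ” followed by a single character, and no such folder exists.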

So invoking this command:
gci -literalpath ("\\server\music\artist1\album1 [disc 1]")

This returns the folder's contents, including file.flac, as expected. The problem with using -literalpath instead of -path is that the -include parameter seems to be ignored, so we can't refine what we are looking for. One workaround, and the point of this post, is to iterate through the items in each album folder yourself. The following code snippet shows a workaround that works:

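# List everything in the album folder; -literalpath treats the brackets as plain characters.
$assetList = gci -literalpath ("C:\temp\music\artist1\album1 [disc 1]")
if ($null -ne $assetList)
{
    foreach ($item in $assetList)
    {
        # Compare in lower case so that .FLAC and .flac are treated the same.
        $n = $item.Name.ToLower()
        if ($n.EndsWith(".flac"))
        {
            # Do something to indicate a match was found.
        }
    }
}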
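The same idea can be written more compactly by piping the -literalpath listing through Where-Object. This is a sketch rather than what our scripts use; it creates a throwaway bracketed folder under the temp directory (via .NET calls, to sidestep any wildcard handling while setting up) so that it can be run as-is.

```powershell
# Throwaway folder whose name contains brackets (illustration only).
$album = Join-Path ([System.IO.Path]::GetTempPath()) ("album1 [disc 1] " + [guid]::NewGuid())
[System.IO.Directory]::CreateDirectory($album) | Out-Null
[System.IO.File]::WriteAllText((Join-Path $album "file.flac"), "x")
[System.IO.File]::WriteAllText((Join-Path $album "folder.jpg"), "x")

# -literalpath finds the folder; Where-Object does the filtering that -include would have done.
$flacFiles = @(gci -LiteralPath $album |
    Where-Object { -not $_.PSIsContainer -and $_.Extension -eq ".flac" })

$flacFiles | ForEach-Object { $_.Name }
```

The -eq comparison is case-insensitive, so no ToLower() call is needed here. The -filter parameter of gci may be another way to narrow the results, but we have not verified how it behaves together with -literalpath.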

For more information on this problem, see this TechNet Tip and this forum post. Here’s a simple PowerShell script that iterates through a two-level directory structure looking for music files (*.flac and a few other formats).


#
# usage : .\findDirectoriesWithNoMusic.ps1 "root"
# example: .\findDirectoriesWithNoMusic.ps1 "\\server\music"
# output : report.txt
#

$arg = $args[0]
if (-not $arg)
{
    # Single-quoted so that nothing inside the message is expanded as a variable.
    write-host '****************************************************
No root directory specified...
usage: .\findDirectoriesWithNoMusic.ps1 "root"
****************************************************'
    exit
}

Write-Progress "Getting directory information" -status "This could take several minutes for a large number of directories."
# The first use of GCI: collect every folder under the root.
# Filtering on PSIsContainer also keeps folders whose names contain a dot; @() keeps a single result as an array.
$list = @(gci -path ($arg) -recurse | where-object { $_.PSIsContainer })

$output = "Report of directories with no music `n"

for ($i=0; $i -le $list.Length - 1; $i++)
{
    # Folders directly under the root are artist folders; only album folders are checked.
    if ($list[$i].Parent.FullName -ne $arg)
    {
        $albumPath = ($list[$i].Parent.Name + "\" + $list[$i].Name).ToLower()
        # The second use of GCI: -literalpath because we can't be sure whether brackets [] exist in $albumPath.
        $assetList = gci -literalpath ($arg + "\" + $albumPath)
        $flag = 0
        if ($null -ne $assetList)
        {
            foreach ($item in $assetList)
            {
                $n = $item.Name.ToLower()
                if ($n.EndsWith(".flac") -or $n.EndsWith(".mp3") -or $n.EndsWith(".wma") -or $n.EndsWith(".mp4") -or $n.EndsWith(".wav") -or $n.EndsWith(".m4a"))
                {
                    $flag = 1
                }
            }
        }
        # No music found: add a ready-to-run rmdir command for this album folder to the report.
        if ($flag -eq 0) { $output += "rmdir " + "`"" + $albumPath + "`" /s /q" + "`n" }
    }
    Write-Progress "Creating report ... " -status "% Complete" -percentcomplete (100*($i/$list.Length))
}

$output | out-file report.txt
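The long chain of EndsWith tests in the script could also be folded into a small helper. The sketch below is one possible shape for it, not part of the script above: Test-HasMusic and $musicExtensions are hypothetical names, and the demo folders are throwaway ones created under the temp directory.

```powershell
# Hypothetical list of extensions that count as music (same set the script checks).
$musicExtensions = ".flac", ".mp3", ".wma", ".mp4", ".wav", ".m4a"

# Hypothetical helper: returns $true if the folder directly contains at least one music file.
function Test-HasMusic([string]$folder)
{
    $items = @(gci -LiteralPath $folder | Where-Object { -not $_.PSIsContainer })
    foreach ($item in $items)
    {
        # -contains is case-insensitive, so .FLAC and .flac both count.
        if ($musicExtensions -contains $item.Extension) { return $true }
    }
    return $false
}

# Demo on two throwaway album folders with brackets in their names.
$base = Join-Path ([System.IO.Path]::GetTempPath()) ("music-" + [guid]::NewGuid())
$withMusic = Join-Path $base "album1 [disc 1]"
$withoutMusic = Join-Path $base "album1 [disc 2]"
[System.IO.Directory]::CreateDirectory($withMusic) | Out-Null
[System.IO.Directory]::CreateDirectory($withoutMusic) | Out-Null
[System.IO.File]::WriteAllText((Join-Path $withMusic "file.flac"), "x")
[System.IO.File]::WriteAllText((Join-Path $withoutMusic "folder.jpg"), "x")

$hasMusic1 = Test-HasMusic $withMusic
$hasMusic2 = Test-HasMusic $withoutMusic
$hasMusic1
$hasMusic2
```

Keeping the extensions in one list means a new format only has to be added in one place.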
