Speeding up PowerShell lookups across large Collections

This week I needed to create a report based on information returned from two queries. The query results were contained in two separate collections (50k+ objects each). Taking the smaller filtered collection and looking up the additional information in the other collection with PowerShell like this proved frustratingly slow:

$extraData = $collection2 | Where-Object { $_.UserPrincipalName -in $collection1.UserPrincipalName }

An alternative was to query the API directly for the additional information while iterating through the main collection, rather than searching for it in the other collection, e.g.

foreach ($obj in $collection1){ $extraData = Invoke-RestMethod -method GET ...... }

That was also far too slow, and sending 50k+ queries to the API wasn’t really being a good net citizen.

Solution

My solution was to join the two collections of objects and then build my report based off just one collection. Step in the Join-Object function from Warren F.

Join-Object provides a lot of flexibility in how and what to join between collections. For my requirements I just needed to join on a common key and bring in all the data from the other collection. That looked like this in PowerShell:

$reportData = Join-Object -Left $collection1 -Right $collection2 -LeftJoinProperty UserPrincipalName -RightJoinProperty UserPrincipalName -Type AllInLeft

Whilst this one line to join my two collections takes just over an hour to execute, my entire report now completes in less than 90 minutes, versus the 5 1/2 hours it was previously taking to run. Thx psCookieMonster
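
For anyone curious about the idea behind a key-based join, here is a minimal sketch that indexes one collection in a hashtable and then walks the other. It is not how Join-Object is implemented and I haven't timed it against the collections above; the sample data and the Department and LastSignIn properties are hypothetical, and it assumes PowerShell 3.0 or later and that UserPrincipalName is unique in the second collection (with duplicates, the last one wins).

```powershell
# Hypothetical sample data standing in for the two 50k+ collections.
$collection1 = @(
    [pscustomobject]@{ UserPrincipalName = 'alice@example.com'; Department = 'Finance' }
    [pscustomobject]@{ UserPrincipalName = 'bob@example.com';   Department = 'IT' }
)
$collection2 = @(
    [pscustomobject]@{ UserPrincipalName = 'alice@example.com'; LastSignIn = '2024-01-15' }
    [pscustomobject]@{ UserPrincipalName = 'carol@example.com'; LastSignIn = '2024-02-01' }
)

# Build the index once: a single pass over the second collection.
# PowerShell hashtable literals compare string keys case-insensitively, like -eq.
$index = @{}
foreach ($item in $collection2) {
    $index[$item.UserPrincipalName] = $item
}

# Left join: keep every object from $collection1 and add data where the key matches.
$reportData = foreach ($obj in $collection1) {
    $extra = $index[$obj.UserPrincipalName]   # $null when the key is not in the index

    $lastSignIn = $null
    if ($extra) { $lastSignIn = $extra.LastSignIn }

    [pscustomobject]@{
        UserPrincipalName = $obj.UserPrincipalName
        Department        = $obj.Department
        LastSignIn        = $lastSignIn
    }
}

$reportData | Format-Table
```

Each lookup goes straight to its key rather than scanning the whole second collection for every object in the first, which is where the Where-Object approach spends its time.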