*Or any use for that matter...
This is one that I should have learnt way earlier than I actually did if i'm being completely honest. One of my colleagues wrote a cool function for multithreading, and although this isn't it, it did made me look into how it all worked.
Some of the techniques can be quite complicated, especially for a DBA who's not as experienced with Powershell yet (and a ForEach loop just works right?) so I thought i'd post about the simplier methods that can be employed, and tweaks to get most of the benefits, without the complexity.
So, lets jump straight into it.
We often have collections/scripts to run against our servers, a list of servers, and it often ends up in something along the lines of...
$serverlist=invoke-sqlcmd $InventoryServer -Database Inventory -query "Select servername from tbl_SQLinventorylist"
ForEach($Server in $serverlist.servername)
{
invoke-sqlcmd $server -Database master -query "....."
}
..etc. etc.
All good and well, unless you have a huge estate of servers to loop through, and/or a limited time to run the collection. For example if you wanted to collect metrics or perform a heartbeat check every 1 minute. A loop against a large set would take a while, we hate RBAR (c/o Jeff Moden L) in SQL, so why do it in Powershell?
As a follow up in my next post, i'll show how to enhance on what is below, and actually use it for something useful by performing a connection check across your estate in parallel, and present the data in an easy viewing manner.
Thankfully, Powershell has been making it easier for us for years, and continues to do so through through new functionality.
Options
ForEach -Parallel switch (Newer versions of PS only)
Runspace pools
Jobs
Threaded Jobs.
I'm going to focus on Jobs and an enhancement to them that does away with most of their limitations, Threaded Jobs.
What is a Powershell Job?
A job is basically an asynchronous session running commands in the background. The key word here is asynchronous, so the job statement returns immediately, while the heavy lifting of the job is being done in the background. This allows for you to call multiple jobs, and have them all running in parallel. Lovely.
Well, potentially lovely, as these are all powershell.exe processes running, each with their memory requirements. Try and run too many of these at once, and your host OS may grind to a halt through resource starvation. Tough love.
Lets start with defining some basic info, and our target list.
$TaskName='GetSQLInfo'
$throttle = (Get-WmiObject –class Win32_processor).NumberOfCores
clear
# Our list of targets goes here.
$Serverlist=@()
$Serverlist='Server1','Server2','Server3'
So here, i've just set $throttle to the number of cores we have, and created an array ($Serverlist) with our list of servers. This could easily be your Invoke-sqlcmd command to your inventory tables, or a call to your CMS using Get-DbaRegServer from dbatools.
Next we create a scriptblock ($GetSQLInfo) of what we actually want to run, and include the $Server as a parameter as this is what we will pass in for each target server later on.
# Script Block of what you actually want to run.
$GetSQLInfo = {
Param($server)
$sqloutput=(Invoke-Sqlcmd -ServerInstance $server -Database master -Query 'SELECT @@servername AS Servername, getdate() as dt')
# Optional - Add some structure to the output
[pscustomobject]@{
Returnedserver=$sqloutput.Servername
Returnedtime=$sqloutput.dt }
}
# End of Script Block to run.
Now we loop through our array of target servers, (hey, I never said this was a Foreach replacement!) and start a job with the script block we have above.
$i=0
ForEach($Server in $serverlist)
{
$i++
while( @(Get-Job -State Running).Count -ge $Throttle)
{ Write-Host "Max concurrency reached...Throttling..." -ForegroundColor Yellow
Start-Sleep -Milliseconds 500}
# Call a job for the task for the server
Start-Job -Name $TaskName-$server -ScriptBlock $GetSQLInfo -ArgumentList $server
}
Write-Host "Threads started for all targets. Collating." -ForegroundColor Green
This is just
setting a counter
looping through the list of servers
checking that there are not currently more jobs than the ($throttle) we set earlier (Number of CPUs) and delaying until its below threshold
Starting a job with out scriptblock created ($GetSQLInfo) and passing the param for that of the target server ($server)
Output will look something like this.
Now we want to pull our results from those jobs. To do that, you use:
# Return results from the jobs
Get-Job | Wait-Job | Receive-Job | Select ReturnedServer, Returnedtime
Quick explaination...
Get-Job - gets the list of current jobs
Wait-Job - waits for jobs to be in a terminating state (ie... finished) before continuing
Receive-Job - gets resultsets from all of our jobs.
Voila. Our results.
Returnedserver : SERVER1
Returnedtime : 27/01/2024 17:19:08
Returnedserver : SERVER2
Returnedtime : 27/01/2024 17:19:08
Returnedserver : SERVER3
Returnedtime : 27/01/2024 17:19:09
Finally, as our parents often told us to tidy our rooms... clear up the jobs after we're done.
# Now Tidy Up
Get-Job | Remove-Job -Force
Thats it, your first parallel query.
But you told us to be careful of Jobs!
I did, as if you try to create too many at once, they can steal all your resource. So you may have to be careful and use a throttling method (using a $throttle variable and checking against current running jobs) as above. But this can slow things down lots, and seriously hurt the reasoning for running in parallel in the first place.
Where Angels Fear to Thread - ThreadJob
This is where Threaded Jobs come in. This is a module available ThreadJob in the Powershell Gallery - PowerShell Gallery | ThreadJob 2.0.3
I like this as there is virtually no change to the code that you use for 'normal' Powershell Jobs.
Essentially what this does is massively lower resource requirements, as each job now runs as a separate thread, rather than a separate process. Great, we can now eat more, faster. Yum.
As with any module - Install and Import.
Install-Module -Name ThreadJob
Import-module ThreadJob
There are other modules out there (PoshRSJob etc), which by all accounts are magnificent. I've taken to the above module as its virtually no additional learning curve to what is written here already, apart from...
...ready...?
Call Start-ThreadJob instead of Start-Job.
Oh, and add a -ThrottleLimit $throttle parameter to throttle it fully at that level too.
Start-ThreadJob -Name $TaskName-$server -ScriptBlock $GetSQLInfo -ArgumentList $server -ThrottleLimit $Throttle
Yup, that's it. No other changes. Done.
<#
Install-Module -Name ThreadJob -Force
import-module ThreadJob
#>
$TaskName='GetSQLInfo'
$throttle = (Get-WmiObject –class Win32_processor).NumberOfCores
clear
# Our list of targets goes here.
$Serverlist=@()
$Serverlist='Server1','Server2','Server3'
# Script Block of what you actually want to run.
$GetSQLInfo = {
Param($server)
$sqloutput=(Invoke-Sqlcmd -ServerInstance $server -Database master -Query 'SELECT @@servername AS Servername, getdate() as dt')
[pscustomobject]@{
Returnedserver=$sqloutput.Servername
Returnedtime=$sqloutput.dt
}
}
# End of Script Block to run.
$i=0
ForEach($Server in $serverlist)
{
$i++
while( @(Get-Job -State Running).Count -ge $Throttle)
{
Write-Host "Max concurrency reached...Throttling..." -ForegroundColor Yellow
Start-Sleep -Milliseconds 500
}
# Call a job for the task for the server
Start-Job -Name $TaskName-$server -ScriptBlock $GetSQLInfo -ArgumentList $server
# Or a Threaded Job, also uncomment module lines
# Start-ThreadJob -Name $TaskName-$server -ScriptBlock $GetSQLInfo -ArgumentList $server -ThrottleLimit $throttle
}
Write-Host "Threads started for all targets. Collating." -ForegroundColor Green
# Return results from the jobs
Get-Job | Wait-Job | Receive-Job | Select ReturnedServer, Returnedtime
# Now Tidy Up
Get-Job | Remove-Job -Force
Next up is a post around using the above to do a connectivity check of many SQL servers at the same time, and making the output a little more friendly.
Thanks for reading as ever.
POshRod.
(I'm not posh, seriously)
Comments