
Automatic SQL Patch Metadata - Scrape it off my shoe.

Patching. We all know how important it is, for those of us with a presence on-prem, that is, or even SQL on an Azure VM without the correct extensions: you still have to patch it yourself.


  • There's a patch,

    • you download it,

      • you install it

        • great, beer time.


Next month/quarter/whatever

There's a patch

you download it... etc


How do you find out about a patch being released? A subscription email? A newsletter? A calendar reminder? A carrier pigeon missive telling you to check Blogspot manually?


(Just for info, I've always referred to the SQL Server Version List (https://sqlserverbuilds.blogspot.com) as Blogspot, for some reason.)


It doesn't take long, but it would be nice to not have to bother at all, and I don't mean by moving everything to the cloud and letting MS handle it either, before anyone says it. :-)


The details are just websites, so the HTML is just plain text, AND the awesome people who provide Blogspot also add the details to a publicly shared Google doc, in multiple formats. Legends.


So what can we do to make our lives even easier? How about downloading/scraping all of the details that you might ever need, and dropping them into tables somewhere? Once it's in SQL, you can then trigger off it however you like.


PowerShell time

Let's scrape some details, download some others, and drop it all in a table for your use. You'll see that some of this data is repeated, so you won't need all of what is below; I'm just pulling all of it together to give you all the options. Ultimately, you can pick and choose what you need, with the very basic data available from a single site for simplicity.


Target Table

First off, we'll need somewhere to store all of this. Here's a fat table with all the details we are going to pull. You can thin this out as much as you like.


CREATE TABLE dbo.SQLPatch (
    SQLServer varchar(10),
    Version varchar(10),
    Build varchar(15),
    SP tinyint,
    CU tinyint,
    FileVersion varchar(20),
    Description varchar(500),
    Link varchar(200),
    ReleaseDate date,
    isSP bit,
    isCU bit,
    isHF bit,
    isRTM bit,
    isCTP bit,
    isSU bit,
    New bit,
    TargetVersion bit,
    Withdrawn bit)

Source Code Files

Code can be downloaded from my GitHub cesspit, found at: https://github.com/sqlrodbloke/sqlrodscripts (GetSQLPatchVersions folder, main branch).


Execution

Here we go, finally: just run the PS with a few choice params.

.\Get-SQLPatchVersions.ps1 -TargetSQLserver localhost -TargetSQLDB Inventory -patchfilebasepath c:\temp\SqlServerBuildDL

You'll end up with the console output from the run, and a nicely populated table in SQL. Yay! Go You!


You've now got the data, simple as that. There'll be a few CSVs in your temp folder as well, with some of the data in.



Let's see what's going on...


Right-o, we'll start with the params we (you) may want to change.


Basically: target server, DB and table, plus a temp location where we download the CSV files to.


param(
    [Parameter(Mandatory=$true)] [string]$TargetSQLserver,
    [Parameter(Mandatory=$true)] [string]$TargetSQLDB, 
    [Parameter(Mandatory=$false)] [string]$TargetSQLTable="dbo.SQLPatch", 
    [Parameter(Mandatory=$false)] [string]$patchfilebasepath="C:\Temp\SqlServerBuildDL")

Now, as we're scraping data from websites, if the URLs ever change you'll need to update them here too.

Add the System.Web type (we'll use it later to URL-encode the Google query):

Add-Type -Assembly System.Web

We'll create an array of the versions we're interested in pulling. Then we'll add in the URL we want to scrape data from ($urlm), execute Invoke-WebRequest, and pull the content back from that website into the $html variable.

$SqlVersion = "2008", "2008 R2", "2012", "2014", "2016", "2017","2019","2022"
Write-Host "Pulling Master SQL Version basic info from MS" -ForegroundColor Cyan

#Get info from MS pages - use this for stripping SP/CU levels later

$urlm = "https://learn.microsoft.com/en-us/troubleshoot/sql/releases/download-and-install-latest-updates#sql-server-complete-version-list-tables";

$html = (Invoke-WebRequest -Uri $urlm -UseBasicParsing).Content;

So what's in the $html variable right now? Effectively the raw HTML content of the URL that you've just hit with the web request.


And as this data is just text, we can parse it as such.
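If you want a quick poke around before parsing it properly, here are a couple of purely illustrative console checks (the build number is just an example):

$html.Length                     # how much text came back
$html.Substring(0, 300)          # peek at the start of the page
$html -match '15\.0\.4335\.1'    # $true if a known build number is in there somewhere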


Next

Clear our table ready. You may want to be smarter about this and stage the new pull first, etc.; a rough sketch of that follows the snippet below.

  Write-Host "Clearing target table." -ForegroundColor Cyan

    $sqlStmt="TRUNCATE TABLE $TargetSQLTable"

    Invoke-Sqlcmd -ServerInstance $TargetSQLserver -Database $TargetSQLDB -Query $sqlStmt -QueryTimeout 15 -ConnectionTimeout 15 -TrustServerCertificate
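If you do want to stage it instead, a minimal sketch, assuming a dbo.SQLPatch_Staging table with the same definition as dbo.SQLPatch (the staging table name is mine, not in the script):

-- Hedged sketch: point the load at the staging table, then swap in one transaction.
TRUNCATE TABLE dbo.SQLPatch_Staging;
-- ...run the inserts below against dbo.SQLPatch_Staging instead, then:
BEGIN TRAN;
    TRUNCATE TABLE dbo.SQLPatch;
    INSERT INTO dbo.SQLPatch SELECT * FROM dbo.SQLPatch_Staging;
COMMIT;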

Write-Host "Pulling additional info from Blogpsot Google doc csv" -ForegroundColor Cyan

So now, for each of our versions required:


  • Build a query, dynamically adding the version we want into the URL, along with the query we want to run.

  • Run that query.


    This effectively downloads the CSV file locally for the version:

  • Import that back into a CSV-formatted variable, $patchinfo (because I'm lazy and letting PS structure/format things for me this way).


(Blogspot also include code showing you how to do this on their site, they're so awesome.)


ForEach ($version in $SqlVersion){
    # Build the Google Visualization query for this version (column A holds the version name)
    $Query = "select * where A='" + $version + "'"

    $patchfilepath = $patchfilebasepath + $version + '.csv'
    $URLb = "https://docs.google.com/spreadsheets/d/16Ymdz80xlCzb6CwRFVokwo0onkofVYFoSkc7mYe6pgw/gviz/tq?tq=" `
          + [System.Web.HttpUtility]::UrlEncode($Query) `
          + "&tqx=out:csv"

    # Download the CSV for this version locally...
    Invoke-WebRequest $URLb -OutFile $patchfilepath
    # ...and import it into CSV format
    $patchinfo = Import-Csv $patchfilepath

Next, for each line in $patchinfo (it's CSV, remember; one line per build number for the specific version), search the previously saved HTML content (in $html above) for the right search string, which in this case is the product build number. If it exists, we have a starting point for our HTML parsing.


So for, say, SQL 2019 CU23... we have 15.0.4335.1 as our search string.


And peeking back in our HTML... there it is, nicely wrapped in a table row for us to use.
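(A purely illustrative check from the console, using that same example build number:)

$html.IndexOf('<td>15.0.4335.1</td>')   # anything other than -1 means the build is on the MS page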


Write-Host "Merging results for $version, extracting SP/CU numbers and loading to SQL table" -ForegroundColor Cyan    

#Loop through each patch line.
        ForEach ($patch in $patchinfo){
            $index =$null
            $sp=$null
            $cu=$null
#Tidy killer characters
$patch.Description=($patch.Description).Replace("'","")

#Now we want to strip out the SP and CU numbers for the patch, we use the MS data for this.           

If ($($patch.Version) -eq '10.0') {$searchVersion = $($patch.Build).Replace('.0.','.00.')}

Else {$searchVersion = $($patch.Build)}
	# find the version text

$index = $html.IndexOf("<td>$searchVersion</td>");

Why?

Wait, stop...why are we doing this?

Well, the data is either incomplete or a mess in both sources; MS have a habit of changing description formatting between versions, or for a hotfix, or just general text structuring, which makes string parsing a nightmare. I'm using the specific SP and CU numbers from the MS location, and everything else from Blogspot (as it contains a load of useful info and breakdowns of the data). As mentioned, you could just pull from MS if you only wanted basic data. But I don't.


Now, back on the wagon... if the version searched for exists:


  • Find the start and the end positions of the surrounding html table row <tr> </tr>

  • Grab what's between, and drop this into an array.

  • For each line, parse the contents with a bit of Regex and store the value we want (SP, CU or both)


if ($index -ne -1) {
    # find the start of the containing "<tr>"
    $tr = $html.LastIndexOf("<tr>", $index)

    # find the text inside the following "<tr>" plus its length
    $start = $html.IndexOf("<tr>", $tr) + "<tr>".Length
    $end = $html.IndexOf("</tr>", $tr)
    $name = $html.Substring($start, $end - $start)

    # split the row into its individual <td> cells
    $extract = $name.Replace('<td>','') -split "</td>"

    $detail = @()
    Foreach ($line in $extract.Trim()){
        $detail += $line
    }

<# If you just want the base detail from MS, use:
    $build = $detail[0]
    $sp = $detail[1]
    $su = $detail[2]
    $link = $detail[3]
    $ReleaseDate = $detail[4]
#>

    # Parse CU and SP values: 'SPn' gives one digit, 'CUnn' gives two...
    $sp = If ($detail[1] -match '^SP[1-9]') {$detail[1].Substring(2,1)}
    $cu = If ($detail[2] -match '^CU[1-9][0-9]') {$detail[2].Substring(2,2)}

    # ...falling back to a single-digit CU if the two-digit match found nothing.
    If ($null -eq $cu) {
        $cu = If ($detail[2] -match '^CU[1-9]') {$detail[2].Substring(2,1)}
    }
}

We now have all the data we want between the two pulls, so let's lob it into SQL.

$sqlStmt="INSERT INTO $TargetSQLTable (SQLServer, Version, Build, SP, CU, FileVersion, Description, Link, ReleaseDate, isSP, isCU, isHF, isRTM, isCTP, isSU, New, TargetVersion, Withdrawn)
  VALUES ('$($patch.sqlServer)', 
           '$($patch.Version)',
           '$($patch.Build)',
           '$sp',
		  '$cu',
           '$($patch.FileVersion)', 
            '$($patch.Description)',
            '$($patch.Link)',
            '$($patch.ReleaseDate)',
            '$($patch.SP)',
            '$($patch.CU)',
            '$($patch.HF)',
            '$($patch.RTM)',
            '$($patch.CTP)',
             NULL, 
            '$($patch.New)',
            NULL,
            '$($patch.Withdrawn)')"

Invoke-Sqlcmd -ServerInstance $TargetSQLserver -Database $TargetSQLDB -Query $sqlStmt -QueryTimeout 15 -ConnectionTimeout 15 -TrustServerCertificate

Finally, we'll do a bit of tidying for any straggling data.


Write-host "Tidying dirty data" -ForegroundColor Cyan

$sqlstmt="UPDATE $TargetSQLTable SET isSU=CASE WHEN Description LIKE '%Security Update%' THEN 1 ELSE 0 END;

UPDATE $TargetSQLTable SET CU= CASE WHEN Description LIKE '%(CU%' THEN SUBSTRING(Description, CHARINDEX('(CU',Description)+3, (CHARINDEX(')',Description)-1)-(CHARINDEX('(CU',Description)+2)) END WHERE isCU=1 and CU=0;

UPDATE $TargetSQLTable SET SP=CASE WHEN Description LIKE '%Service Pack%' THEN SUBSTRING(Description, CHARINDEX('Service Pack', Description)+13,(CHARINDEX('(', Description))-(CHARINDEX('Service Pack', Description)+13) ) END WHERE isSP =1 and SP=0;"

Invoke-Sqlcmd -ServerInstance $TargetSQLserver -Database $TargetSQLDB -Query $sqlStmt -QueryTimeout 15 -ConnectionTimeout 15 -TrustServerCertificate

Summary

And that's it, the most recent data is now downloaded/scraped and loaded into a SQL table for you to use however you wish.

  • Bomb a distribution list with alerts that they need to do something? - You got it.

  • Create a fancy PBI report detailing the estate state? - Tick.

  • Just improve your SQL inventory system with up-to-date data? - Check.


Key takeaway is that you have the data, and it's auto-updating.
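As a flavour of the alerting option, here's a minimal sketch; the server, database, SMTP relay and addresses are all assumptions, so swap in your own:

# Hedged sketch: mail out anything released in the last 7 days.
$new = Invoke-Sqlcmd -ServerInstance localhost -Database Inventory -TrustServerCertificate -Query "
    SELECT SQLServer, Build, Description, ReleaseDate
    FROM dbo.SQLPatch
    WHERE ReleaseDate >= DATEADD(DAY, -7, GETDATE())"

if ($new) {
    $body = $new | Format-Table -AutoSize | Out-String
    Send-MailMessage -SmtpServer 'smtp.example.com' -From 'sql@example.com' `
        -To 'dba-team@example.com' -Subject 'New SQL Server patches released' -Body $body
}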


Probably more than you'll need, but it's all there and you can thin it as needed. You will need to schedule this, but as it's simple PS, you can use whatever you like to do that.
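For example, a rough scheduled task sketch; the script path and schedule here are assumptions:

# Hedged sketch: run the script weekly via a Windows scheduled task (needs an elevated prompt).
$action  = New-ScheduledTaskAction -Execute 'powershell.exe' `
           -Argument '-File C:\Scripts\Get-SQLPatchVersions.ps1 -TargetSQLserver localhost -TargetSQLDB Inventory'
$trigger = New-ScheduledTaskTrigger -Weekly -DaysOfWeek Monday -At '06:00'
Register-ScheduledTask -TaskName 'Get-SQLPatchVersions' -Action $action -Trigger $trigger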


You may notice that there are some fields left unpopulated, e.g. TargetVersion. The reason I like to include this is that different companies have different policies for patching. The most common I've seen are:

  • Patch n-1 - Always stay 1 version behind the latest.

  • Latest patch version > x days - i.e. patch to the latest version that is at least x days old.


A simple TSQL statement can now set the version that your policy says you should be applying at any one point.
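Here's a minimal sketch of the n-1 flavour, assuming one row per build and the isCU/Withdrawn flags populated as above; the "x days old" flavour is just a different filter on ReleaseDate:

-- Flag the second-newest non-withdrawn CU per version as the target.
;WITH ranked AS (
    SELECT SQLServer, Build,
           ROW_NUMBER() OVER (PARTITION BY SQLServer ORDER BY ReleaseDate DESC) AS rn
    FROM dbo.SQLPatch
    WHERE isCU = 1 AND ISNULL(Withdrawn, 0) = 0
)
UPDATE p
SET TargetVersion = CASE WHEN r.rn = 2 THEN 1 ELSE 0 END
FROM dbo.SQLPatch p
JOIN ranked r ON r.SQLServer = p.SQLServer AND r.Build = p.Build;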


End of Life arguments


We've been there, having old servers that nobody wants to upgrade. Wouldn't it be useful to automatically pull the dates, so we could write logic/reports around them?

The same simple web-scraping technique can be used to pull other useful info and populate it into tables for use. In the next post, I'll show how to do this to pull SQL end-of-life dates and populate tables as needed.


Thanks for reading.

Spider Scrapegoat Rod


Original main picture courtesy of Pixabay.
