Azure SQL DB Failover Group Orphaned Users

Published On: 2019-07-19

I have a client that resides in the Azure data platform arena, focusing mainly on Platform as a Service (PaaS) and Azure SQL DB. They have a decent landscape of Azure resources and currently utilize failover groups within SQL DB to facilitate high availability. Under the covers, failover groups are essentially availability groups and can exhibit issues similar to those you might encounter with the on-premises version.

A common issue that you might encounter, which my client did recently, revolves around orphaned users. Orphaned users occur when the user object in the database has a different SID (security identifier) than what the login says it should be. It is also possible that the login does not exist at the server level. Orphaned users are specifically related to SQL logins, not Active Directory authentication. When dealing with on-premises databases, this was commonly found when restoring a database from one server to another, and you had to manually adjust the database user to fix the incorrect SID. Orphaned users can also occur with Azure SQL DB and failover groups. The login is first created on the primary server and then the database user is created in the user database. The syntax would look like this:

-- While in Master
CREATE LOGIN [User2] WITH PASSWORD = 'ThisIsNotThePassword1234'
GO
-- While in the user database
CREATE USER [User2] FROM LOGIN [User2]
GO

As soon as the database user is created, the command is sent to the secondary replicas. However, the login is not, and this causes the SID of the database user to not match any corresponding login on the secondary server. In addition, just like availability group secondaries, the secondary databases in a failover group are read-only, so you cannot modify the data or the database, including user objects. You can, however, modify the login in the Master database, which allows you to resolve the issue.

Let’s take a look.

Setup

I’ve got two servers in my Azure subscription, one is located in the East US region and the other in the West US region, each hosting a SQL DB named “Demo1”.

I have configured a failover group between these two servers such that sqldbdemo-east is the primary and sqldbdemo-west is the secondary.

Using SSMS, we can connect to both servers and see that User1 is a login for both servers. User1 is also a database user for Demo1 on both servers (it's not shown, but trust me). Since sqldbdemo-east is the primary, we can add a new user.

Before we get started, remember that with SQL DB you can't use a USE statement, so you must be connected to Master to create the login.

CREATE LOGIN [User2] WITH PASSWORD = 'ThisIsNotThePassword1234';

Once the login has been created, we can create the corresponding database user within Demo1. Make sure to change the context of the query to the database in question.

CREATE USER [User2] FROM LOGIN [User2]

We can verify that the new user exists in Demo1 on sqldbdemo-east.

If we check the secondary database on sqldbdemo-west, we will also see that the user was created there.   Remember that this user was created by the failover group automatically.  You do not have to manually create this user; it will be done for you.

 

We can also compare the SIDs for both users to ensure that they are the same.
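
One quick way to do this (a minimal sketch; any query that returns the sid column will do) is to run the same query against Demo1 on both servers and compare the results:

-- Run in Demo1 on both sqldbdemo-east and sqldbdemo-west;
-- the sid values returned should match
SELECT name, sid
FROM sys.database_principals
WHERE name = 'User2';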

However, the issue comes into play because the login for User2 does not yet exist on sqldbdemo-west.

 

Since there isn't a login associated with the user, someone using the credentials for User2 will not be able to authenticate properly. Most likely you will get a login failure error.

This can be fixed by adding the login User2 to sqldbdemo-west.  We will use the same password that was used to create User2 on sqldbdemo-east.
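
While connected to Master on sqldbdemo-west, the fix at this stage is simply to create the login with the same syntax used on the primary (a quick sketch using the demo password from earlier):

-- While in Master on sqldbdemo-west
CREATE LOGIN [User2] WITH PASSWORD = 'ThisIsNotThePassword1234'
GO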

Once the login is created and granted access to Master, we can then log into sqldbdemo-west.

Here we can see that User2 has been authenticated and I can see the databases on the server. However, if I attempt to access Demo1, I get an error.

The SIDs Did It

This issue occurs because the SIDs for the two logins are not the same. Since they are not the same, the user object in Demo1 on sqldbdemo-west cannot be matched to the new login. Remember that because it is a secondary replica, the database is read-only, so you will not be able to do anything with the user object. Instead, you will have to drop and recreate the User2 login on the secondary server with the appropriate SID. You can find the correct SID to use by looking in these places on the primary server (see the example query after this list):

  • Database level – sys.database_principals
  • Database level – sys.sysusers
  • Master level – sys.sql_logins
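
For example, while connected to Master on the primary server, something like this will return the SID you need (assuming the login is named User2, as in this demo):

-- While in Master on the primary server
SELECT name, sid
FROM sys.sql_logins
WHERE name = 'User2';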

Once you have obtained the correct SID value, while in the context of Master on the secondary server, do the following:

-- While in Master
DROP LOGIN [User2]
GO
CREATE LOGIN [User2] WITH PASSWORD = 'ThisIsNotThePassword1234', SID = 0x01060000000000640000000000000000CA6EAC7C69DC024DBB850F80F8E595E6
GO

Now that the database user SID matches the login SID, User2 can authenticate to the server as well as access the database.

Summary

The cloud offers up a lot of new things; however, orphaned users can happen there just like they do with on-premises instances. This will only happen with SQL logins, but the resolution is straightforward once you see where the issue lies. The next time you have to create user accounts in a failover group, make sure to use the right SID from the start and you will save yourself some headaches.

Getting Your SQL Server Backup Footprint

Published On: 2019-06-28

Recently, I needed to be able to determine the total size of all the most recent full backups for certain servers. Luckily for me, these particular servers were using Ola Hallengren's Maintenance Solution, which simplified things since each backup had its own specific directory. PowerShell has the flexibility to easily provide what I need with little effort.

Let’s take a look!!

First, I usually prefer to have production backups stored separately from non-production backups; however, in this case that type of structure wasn't feasible. All of the servers backed up to the same location, including DR servers. So, I needed a way to specify which exact directories I wanted to look through for the FULL backup files, since the task was to provide this information only for production databases. In order to facilitate this, I used a simple array of server names. Remember that with Ola's solution each server will have its own parent folder, effectively segregating the folders that I want from the folders I don't need.

#sum of most recent full backup files
$srvs = @("server1","server2","server3")
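
For context, Ola's solution typically lands each server's backups in a per-server, per-database folder structure, roughly like the layout sketched below (illustrative paths only; the exact folder names depend on how the solution is configured):

D:\backups\server1\Database1\FULL\
D:\backups\server1\Database1\LOG\
D:\backups\server2\Database1\FULL\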

Now that I know which folders I'm going to look through, I can use a foreach loop to iterate through the array. Essentially, I want PowerShell to do something for each server listed within the server array.
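
As a skeleton, the loop looks something like this (just a sketch; the body is filled in with the steps that follow):

foreach ($srv in $srvs){
    # build the backup path and gather this server's recent full backup files here
}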

Next, I need to build out the path to go look for.

$path = "D:\backups\$($srv)"

Once the path is built, I can use Get-ChildItem (or its alias "gci") to recurse through the path created above. The Get-ChildItem cmdlet is then piped into the Where-Object cmdlet, where I can specify search filters; in this case, a file extension of ".bak" and a last write time greater than 1 day ago. The resulting set of files is placed into the $x variable.

$x = gci -path $path -Recurse | where-object {($_.extension -eq ".bak") -and ($_.lastWriteTime -gt (get-date).AddDays(-1))}

Keep in mind that the $x variable only holds the list of files for whichever server the foreach loop is currently processing. Remember that we are processing the full backup files for each server individually. Since $x is an object in PowerShell, we can use the Measure-Object cmdlet to sum up the length of all the files. This gives us the total size of all of the backup files that were created within the past day for that given server.

[long]$amt +=($x | measure-object -sum length).sum | out-string

We also want to retain the total from each iteration and add it to the previous amount. We do this by declaring another variable, $amt, and using PowerShell's addition assignment operator (+=), which adds the sum to whatever value $amt currently holds. In SQL Server T-SQL terms, this is "SET @X = @X + @sum". Note that we also cast $amt as a "long", since the sum is in bytes and will most likely be a large number.

Here is the entire script.  You can also find modifications on my GitHub account.

#sum of most recent full backup files
$srvs = @("server1","server2","server3")
foreach ($srv in $srvs){
    $path = "D:\backups\$($srv)"
    $x = gci -path $path -Recurse | where-object {($_.extension -eq ".bak") -and ($_.lastWriteTime -gt (get-date).AddDays(-1))}
    [long]$amt += ($x | measure-object -sum length).sum | out-string
}
$amt
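
Since $amt ends up holding a raw byte count, it can be handy to format it into something more readable. A possible addition (not part of the original script) using PowerShell's built-in 1GB constant:

# convert the byte total into gigabytes for readability
"{0:N2} GB" -f ($amt / 1GB)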

Summary

There are many ways to skin the proverbial cat, but this is just a quick and easy way to determine how big all of your backup files might be. You can find this information via T-SQL as well; however, this method allows you to quickly process multiple servers, assuming of course that all of the backup files are in the same location. There could very well be a more efficient way to accomplish this. I am not a PowerShell expert by any means, so if you have done something similar, please share your story with me!

Keep in mind that this solution was specifically geared around utilizing Ola Hallengren’s Maintenance Solution.  If you are using a different file structure for your backup files, you will need to adjust the script accordingly.

Got a better way?  Should I have done something differently?  Shout it out!

 

 


SQL Server 2008 End of Support D-Day

Published On: 2019-06-21

As the saying goes, time flies when you are having fun. We are already in the second half of June 2019, and if you were not aware, extended support for Microsoft SQL Server 2008 SP4 and 2008 R2 SP3 ends on July 9th, 2019. Previous service packs for each product have already expired, and hopefully you are on the latest service pack anyway. Keep in mind that this also includes regular security updates. For those who might be counting, that is 19 days away (including the due date). 19 days; do you know where your 2008 and 2008 R2 servers are? If so, do you have a plan?

D-Day

The documentation from Microsoft is fairly clear on the dates. If you have not read it, I recommend you take a look to get an idea of when your products sunset. July 9th, 2019 is indeed the final day that Microsoft will provide extended support for these versions.

What does that really mean though?

It means that if you are still running those product versions, and you very well could be (I still have clients with SQL 2005 running in production), any support you wish to garner from Microsoft will have a financial impact. In other words, if you need help, you'll have to crack open the bank account and fork over some cash to Microsoft. I don't know how much that will be; however, Microsoft is pushing pretty hard to get people to address these older versions of the product.

Mitigation

Hopefully you are already well aware of this looming deadline and have already put an action plan into motion ensuring that your production environment continues to be supported.  As previously mentioned, there are several things that you can do to ensure support.

Azure – If you migrated your on-premises 2008/2008 R2 workloads into an Azure VM (Infrastructure as a Service), you will continue to get extended support for the product at no extra charge for another 3 years.  Keep in mind, that this extension is intended for you to eventually migrate away from 2008/2008R2 onto new versions of the product.  However, this would give you a little breathing room as you formulate and implement an action plan.

Upgrade – If moving to Azure is not a viable option, you can migrate to a newer version of SQL Server.  Not only will you continue to get support with new versions, but there are some vast improvements in the product so you might also get a performance boost along with it.

Purchase Support – If the previous two options don’t work, as a last resort, you can purchase an extended support plan for SQL Server 2008 & 2008 R2.  While I do not know what kind of price point this will be, I suspect that it might not be cheap.  As I mentioned, Microsoft is pushing fairly hard to get organizations to move to newer versions.  If you are interested in this, contact your local Microsoft licensing partner for further details.

Summary

The deadline is quickly approaching, and those 19 days will be gone in a blink. Thankfully, Microsoft has provided several ways that you can mitigate losing extended support for SQL Server 2008 and 2008 R2. Do your due diligence, look at your environment, and formulate an action plan that you can put into place.

After all, the servers you save just might be your own.

Don’t Forget RoboCopy

Published On: 2019-05-03

On occasion, you have to find a means to get around an issue. Recently at a client, I discovered an unknown SQL Server that wasn't being properly cared for. This particular instance wasn't taking backups, and it was deemed a critical production server. You can begin to see the problems that were arising.

However, there is a twist to the story. Due to changes in Active Directory, I couldn't get the appropriate permissions configured for the SQL Server service to allow it to back up to the network destination. I also could not take an outage anytime soon to change the service account to something better suited. It's a long and convoluted story for another post. The immediate need was to 1) get backups of all of the databases locally and 2) find a way to move them off of the local server. I also needed to do this on a schedule so that I could meet the RPO/RTO guidelines the client had set forth.

My solution?  RoboCopy.

Robocopy has been around for years within the Microsoft ecosystem and is highly versatile. However, until recently, it wasn't a tool I was well versed in and, frankly, I hardly used it. Over the past year or so, though, I have found myself using it more and more as a solution for file movement when needed.

Essentially, robocopy will move files or even directories from one location to another. It will copy permissions without any additional coding, and it will only copy files that do not already exist at the destination. This is useful in that you do not have to explicitly account for files that might already exist at the destination; it handles that for you. Some other benefits of robocopy:

  • Ability to restart in the event it fails, both reading and writing files
  • Handles subdirectories
  • Copy file attributes
  • Can mirror directories
  • Move files and then delete the source once they are copied
  • many others…

You can see all of the options from the documentation from Microsoft.

In this case, I’m interested in using the following switches:

  • /s – process all of the subdirectories from the specified root folder
  • /z – copy files in restartable mode. If the copy dies in the middle, it can pick up where it left off
  • /eta – show the estimated time of arrival for the files being copied
  • /r:3 – retry a failed copy 3 times before giving up. The default is 1 million retries
  • /w:3 – the number of seconds to wait between retries. The default is 30 seconds

RoboCopy In Action

In this case, using RoboCopy is pretty straightforward. I have Ola Hallengren's Maintenance Solution running on the SQL Server instance, so I configured it to back up to a specific directory on the local server. I also make note of the available drive space. On this particular instance, the databases are relatively small; 3 days of backups stored locally comes to less than 5GB, so disk usage is light. If you have larger databases, you might need to adjust how many backups you keep locally so that you do not consume the entire drive. Adjust accordingly.

The script itself is pretty simple and follows this format with the previously listed switches:

robocopy [source] [destination] /s /z /eta /r:3 /w:3

or

robocopy c:\temp\servername \\naservername\sharename$ /s /z /eta /r:3 /w:3

SQL Agent Job

We currently take transaction log backups of the production databases every 15 minutes. Since I am backing up the databases as well as the transaction logs to the local disk, I needed to be able to push them up to the network storage on a regular basis. You could use Windows Task Scheduler to facilitate this; however, since I'm more versed in the SQL Server side of things, I decided to put this into an agent job. This gave me a lot of flexibility in scheduling, and given that the files should be small, I can schedule the job to run more frequently. Running the job more frequently allows me to exceed the RPO/RTO requirements without a lot of extra overhead.

To set this up, first we have to put the command into a batch file that we can call from the agent job. In the batch file, we have to adjust the error levels so that successful robocopy exit statuses are suppressed and only actual errors are reported. We do this because robocopy reports a successful copy with a non-zero ERRORLEVEL, and SQL Server Agent treats any non-zero exit code as a failure and will subsequently fail the step. I borrowed the code below from Rob Volk (T) and one of his blog posts.

rem suppress successful robocopy exit statuses, only report genuine errors (bitmask 16 and 8 settings)
set/A errlev="%ERRORLEVEL% & 24"

Next, we have to exit with an appropriate errorlevel so that the agent job can succeed or fail correctly.

rem exit batch file with errorlevel so SQL job can succeed or fail appropriately
exit/B %errlev%

Putting it all together, with the robocopy command running first, the entire batch file looks like this:

robocopy c:\temp\servername \\naservername\sharename$ /s /z /eta /r:3 /w:3

rem suppress successful robocopy exit statuses, only report genuine errors (bitmask 16 and 8 settings)
set/A errlev="%ERRORLEVEL% & 24"

rem exit batch file with errorlevel so SQL job can succeed or fail appropriately
exit/B %errlev%

Once the batch file was created, I could create a standard agent job with a step that executes the batch file.

The step type is set to "Operating System (CmdExec)", and the "/c" switch tells cmd.exe to execute the string supplied, which in this case is the batch file that I created.
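
For reference, the command in the job step would look something like this (the path to the batch file is hypothetical; point it at wherever you saved yours):

cmd.exe /c "C:\Scripts\CopyBackupsToShare.bat"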

Since the client has an RTO/RPO of 15 minutes, I scheduled the job to run every 5 minutes. This helps ensure that any recent backups are pushed to secondary off-site storage within the 15-minute window in the event of a hardware failure.

Summary

The real solution to this issue would be to adjust the permissions properly for the SQL Server service or to change the service account altogether. This would allow SQL Server to back up directly to the network destination without needing a mitigating step in between. However, there are times where that might not be possible, and you have to find another solution. In this case, RoboCopy was a great middle-man solution that fit all of my requirements. Next time you need to move files around, remember that RoboCopy might just fit the bill.

Enjoy!
