Pythian Blog: Technical Track

SQL Server and OS Error 1117, Error 9001, Error 823

Like other administrators, we DBAs lead lives full of adventure. At times we encounter an issue that is entirely new to us, one we have never faced before. Today I will write about such a case. Not so long ago, at the beginning of June, I was having my morning tea when I got a page from a customer we normally do not receive pages from. While analyzing the error logs, I noticed several error lines like the ones below:

2014-06-07 21:03:40.57 spid6s Error: 17053, Severity: 16, State: 1.
LogWriter: Operating system error 21(The device is not ready.) encountered.
2014-06-07 21:03:40.57 spid6s Write error during log flush.
2014-06-07 21:03:40.57 spid67 Error: 9001, Severity: 21, State: 4.
The log for database 'SSCDB' is not available. Check the event log for related error messages. Resolve any errors and restart the database.
2014-06-07 21:03:40.58 spid67 Database SSCDB was shutdown due to error 9001 in routine 'XdesRMFull::Commit'. Restart for non-snapshot databases will be attempted after all connections to the database are aborted.
2014-06-07 21:03:40.65 spid25s Error: 17053, Severity: 16, State: 1.
fcb::close-flush: Operating system error (null) encountered.
2014-06-07 21:03:40.65 spid25s Error: 17053, Severity: 16, State: 1.
fcb::close-flush: Operating system error (null) encountered.
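
When the error log is long, a small script can help pull out the entries of interest. The following is a minimal Python sketch and was not part of the original troubleshooting. It assumes the log text has the same layout as the excerpt above (timestamp, spid, then `Error: N, Severity: N, State: N.`, with the message on the next line). The function name `find_errors` and the set of watched error numbers are illustrative choices.

```python
import re

# Matches header lines such as:
# 2014-06-07 21:03:40.57 spid6s Error: 17053, Severity: 16, State: 1.
ERROR_LINE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+)\s+"
    r"(?P<spid>\S+)\s+"
    r"Error: (?P<error>\d+), Severity: (?P<severity>\d+), State: (?P<state>\d+)\."
)

# Error numbers discussed in this post (an assumption; adjust as needed).
WATCHED = {823, 9001, 17053}


def find_errors(log_text, watched=WATCHED):
    """Return one dict per watched error, including the message line that follows."""
    lines = log_text.splitlines()
    hits = []
    for i, line in enumerate(lines):
        m = ERROR_LINE.match(line.strip())
        if not m:
            continue
        number = int(m.group("error"))
        if number not in watched:
            continue
        # In the excerpt, the message text sits on the line after the header.
        message = lines[i + 1].strip() if i + 1 < len(lines) else ""
        hits.append({
            "timestamp": m.group("ts"),
            "spid": m.group("spid"),
            "error": number,
            "severity": int(m.group("severity")),
            "message": message,
        })
    return hits


# A few lines copied from the excerpt above, used as sample input.
SAMPLE = """\
2014-06-07 21:03:40.57 spid6s Error: 17053, Severity: 16, State: 1.
LogWriter: Operating system error 21(The device is not ready.) encountered.
2014-06-07 21:03:40.57 spid6s Write error during log flush.
2014-06-07 21:03:40.57 spid67 Error: 9001, Severity: 21, State: 4.
The log for database 'SSCDB' is not available. Check the event log for related error messages. Resolve any errors and restart the database.
"""

if __name__ == "__main__":
    for hit in find_errors(SAMPLE):
        print(hit["timestamp"], hit["error"], hit["severity"], hit["message"])
```

Sorting or grouping the returned entries by severity is one way to surface the severity 21 entry (Error 9001) ahead of the surrounding severity 16 entries.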

I had never seen this kind of error before, so my next step was to check Google, which returned too many results. Two sites were worthwhile: the first, a Microsoft KB article, covers OS Error 1117, whereas the second, by Erin Stellato, talks about other errors such as Error 823 and Error 9001. Further, I checked the server details and found exactly what the issue was: the server was using a PVSCSI (Paravirtual SCSI) controller, as opposed to LSI, on the VMware host.

Resolving the issue

I had a call with the client and got his consent to restart the service. The restart was quick, and after the service came back I ran DBCC CHECKDB. “We are good!” I thought.
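
DBCC CHECKDB ends its output with a summary line reporting the number of allocation and consistency errors. If that output is saved to a file, the summary can be checked programmatically. This is a minimal Python sketch under that assumption. The function name `checkdb_is_clean` is hypothetical, and the sample string is illustrative rather than the actual output from this incident.

```python
import re

# Summary line that DBCC CHECKDB prints at the end of its output, e.g.:
# CHECKDB found 0 allocation errors and 0 consistency errors in database 'SSCDB'.
SUMMARY = re.compile(
    r"CHECKDB found (?P<alloc>\d+) allocation errors and "
    r"(?P<consistency>\d+) consistency errors in database '(?P<db>[^']+)'"
)


def checkdb_is_clean(output, database):
    """Return True only if the summary for `database` reports zero errors."""
    for m in SUMMARY.finditer(output):
        if m.group("db").lower() == database.lower():
            return int(m.group("alloc")) == 0 and int(m.group("consistency")) == 0
    # No summary line found: treat as "not verified" rather than assume success.
    return False


# Illustrative sample text, not output captured from the server in this post.
SAMPLE_OUTPUT = (
    "CHECKDB found 0 allocation errors and 0 consistency errors "
    "in database 'SSCDB'."
)

if __name__ == "__main__":
    print(checkdb_is_clean(SAMPLE_OUTPUT, "SSCDB"))
```

Returning `False` when no summary line is present is a deliberate choice here: a missing summary means the check did not finish or the wrong output was captured, which should not be read as a clean result.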

But wait. This was only a temporary fix. Yes, you read that correctly. The issue actually lies with VMware, and it is a known issue according to a VMware KB article. To fix it permanently, we will have to upgrade to vSphere 5.1, as that article recommends.

Please note the order of operations here: I applied the temporary fix first and did the root cause analysis last, once the server was up and running fine.

