Topic
When a database is corrupted, we will alwayssuggest restoring last known good database backup instead of running DBCCCHECKDB command to fix it. Can you explain why? Try to use a demo to show whyDBCC CHECKDB command is not always a first choice for a corrupted database.
Answer:
REPAIR_ALLOW_DATA_LOSS
· The purpose of repair is not to save user data.The purpose of repair is to make the database structurally consistent as fast as possible(to limit downtime) and correctly (to avoid making things worse).
· This means that repairs have to be engineered tobe fast and reliable operations that will work in every circumstance. Thesimple way to do this is to delete what's broken and fix up everything thatlinked to (or was linked from) the thing being deleted - whether a record orpage.
· This means that repairs have to be engineered tobe fast and reliable operations that will work in every circumstance. Thesimple way to do this is to delete what's broken and fix up everything thatlinked to (or was linked from) the thing being deleted - whether a record orpage.
· Things to look for that mean repair won’t be ableto fix everything are:
- CHECKDB stops early and complains about system table pre-checks failing (errors 7984 – 7988 inclusive)
- CHECKDB reports any metadata corruption (8992, 8995 errors)
- CHECKDB reports any errors on PFS page headers (8939, 8946 errors with a possible 8998 error as well)
Test1: Damage the data page
--create a test database
usemaster
createdatabase testcheckdb
usetestcheckdb
--create table t3
droptable t3
createtable t3
(
c1 int primarykey,
c2 nvarchar(50)
)
--insert a row (0, 'ABCDEFG')
select* from t3
insertinto t3 values(0,'ABCDEFG')
go
--insert 1000 rows
declare@count int
set@count = 1
WHILE@count <= 1000
BEGIN
insert into t3values(@count,'abc')
set@count = @count+1
END;
GO
--check data
select* from t3
--check page
dbcctraceon(3604)
DBCCind('testcheckdb', t3,-1)
--//154-IAMpage;155-index root page;153,156,157-data page
DBCCpage('testcheckdb', 1,154,3)--//
DBCCpage('testcheckdb', 1,155,3)--//
DBCCpage('testcheckdb', 1,153,3)--// m_slotCnt = 351
DBCCpage('testcheckdb', 1,156,3)--// m_slotCnt = 352
DBCCpage('testcheckdb', 1,157,3)--// m_slotCnt = 298
--set db offline
usemaster
alterdatabase testcheckdbsetoffline
--modify the data page
alterdatabase testcheckdbsetonline
--check data again
usetestcheckdb
select* from t3
• SQL Server detected a logical consistency-based I/O error: incorrect checksum (expected: 0x54781bb2;actual: 0x54781c32). Itoccurredduring a read of page (1:153) in database ID 12 at offset0x00000000132000 in file 'C:\Program Files\Microsoft SQLServer\MSSQL10_50.MSSQLSERVER\MSSQL\DATA\testcheckdb.mdf'. Additional messages in the SQL Server errorlog or system event log may provide more detail. This is a severe errorcondition that threatens database integrity and must be corrected immediately.Complete a full database consistencycheck (DBCC CHECKDB). This error can be caused by many factors; formore information, see SQL Server Books Online.
USEtestcheckdb
DBCCCHECKDB(12)
GO
usemaster
ALTERDATABASE testcheckdbSETSINGLE_USER;
GO
DBCCCHECKDB(12, REPAIR_ALLOW_DATA_LOSS)
WITHALL_ERRORMSGS,NO_INFOMSGS
GO
Repair: TheClustered index successfully rebuilt for the object "dbo.t3" indatabase "testcheckdb".
Repair: Thepage (1:153) has been deallocated from object ID 2105058535, index ID 1,partition ID 72057594038779904, alloc unit ID 72057594039697408 (type In-rowdata).
Msg 8945,Level 16, State 1, Line 1
Table error:Object ID 2105058535, index ID 1 will be rebuilt.
The error has been repaired.
Msg 8928,Level 16, State 1, Line 1
Object ID2105058535, index ID 1, partition ID 72057594038779904, alloc unit ID72057594039697408 (type In-row data): Page (1:153) could not be processed. See other errors for details.
The error has been repaired.
Msg 8939,Level 16, State 98, Line 1
Table error:Object ID 2105058535, index ID 1, partition ID 72057594038779904, alloc unit ID72057594039697408 (type In-row data), page (1:153). Test (IS_OFF (BUF_IOERR,pBUF->bstat)) failed. Values are 12584969 and -4.
The error has been repaired.
Msg 8980,Level 16, State 1, Line 1
Table error:Object ID 2105058535, index ID 1, partition ID 72057594038779904, alloc unit ID72057594039697408 (type In-row data). Index node page (1:155), slot 0 refers tochild page (1:153) and previous child (0:0), but they were not encountered.
The error has been repaired.
Msg 8978,Level 16, State 1, Line 1
Table error:Object ID 2105058535, index ID 1, partition ID 72057594038779904, alloc unit ID72057594039697408 (type In-row data). Page (1:156) is missing a reference fromprevious page (1:153). Possible chain linkage problem.
The error has been repaired.
CHECKDBfound 0 allocation errors and 4 consistency errors in table 't3' (object ID2105058535).
CHECKDBfixed 0 allocation errors and 4 consistency errors in table 't3' (object ID2105058535).
CHECKDBfound 0 allocation errors and 4 consistency errors in database 'testcheckdb'.
CHECKDBfixed 0 allocation errors and 4 consistency errors in database 'testcheckdb'.
usetestcheckdb
--check data again
select* from t3
--check page again
dbcctraceon(3604)
DBCCind('testcheckdb', t3,-1)
--//former: 154-IAMpage;155-index root page;153,156,157-data page
--//now: 159-IAMpage;169-index rootpage;158,168-data page
DBCCpage('testcheckdb', 1,159,3)--//
DBCCpage('testcheckdb', 1,169,3)--//
DBCCpage('testcheckdb', 1,158,3)--// m_slotCnt = 352
DBCC page('testcheckdb', 1,168,3)--// m_slotCnt = 298
Test2: Damage the PFS page
--//check the PFS page
DBCCpage('testcheckdb', 1,1,2)-
--then online the db , selectand insert.
--check the data, valid.
usetestcheckdb
select* from t3
--insert a row, valid
insertinto t3 values(1001,'ABCD')
--insert another 1000 rows, just 52 rows inserted ok. The following failed
declare@count int
set@count = 1002
WHILE@count <= 2000
BEGIN
insert into t3values(@count,'abc')
set@count = @count+1
END;
GO
Thestatement has been terminated.
Msg824, Level 24, State 2, Line 5
SQLServer detected a logical consistency-based I/O error: incorrect checksum(expected: 0x20c60630; actual: 0x80660ace). It occurred during a read of page(1:1) in database ID 12 at offset 0x00000000002000 in file 'C:\ProgramFiles\Microsoft SQLServer\MSSQL10_50.MSSQLSERVER\MSSQL\DATA\testcheckdb.mdf'. Additional messages in the SQL Server errorlog or system event log may provide more detail. This is a severe errorcondition that threatens database integrity and must be corrected immediately.Complete a full database consistency check (DBCC CHECKDB). This error can becaused by many factors; for more information, see SQL Server Books Online.
--then try to check page
dbcctraceon(3604)
DBCCind('testcheckdb', t3,-1)
--//former: 154-IAMpage;155-index root page;153,156,157-data page
--//now: 159-IAMpage;169-index rootpage;158,168-data page
dbcctraceon(3604)
DBCCpage('testcheckdb', 1,159,3)--//
DBCCpage('testcheckdb', 1,169,3)--//
DBCCpage('testcheckdb', 1,158,3)--// former m_slotCnt = 352, now m_slotCnt = 352
DBCC page('testcheckdb', 1,168,3)--// former m_slotCnt =298, now m_slotCnt = 351
--checkdb
DBCCCHECKDB(12)
WITHALL_ERRORMSGS,NO_INFOMSGS
GO
|
--try to repair, fail
DBCCCHECKDB(12, REPAIR_ALLOW_DATA_LOSS)
WITHALL_ERRORMSGS,NO_INFOMSGS
GO
|