4

Troubleshooting SQL VCDB Growth

Bloated VCDB Database

 

The vCenter Database (VCDB) is a lean, mean, data collection machine and it’s what makes your vCenter YOUR vCenter. The VCDB stores things like events, statistics, licensing, and configuration information for your environment. Under normal conditions your VCDB should only be a few GB depending on the size of your environment, but occasionally we see some bloating. A little bloating is okay and in fact is normal, same as with people (especially around the holidays). But what happens when your VCDB becomes so gluttonous that it grows over 50GB or even 100GB? You’ll start to notice vCenter becoming extremely sluggish and can take a long time for your inventory to show up after a reboot. This is a scary sight at first…

empty vm inventory

In this example we’ll take a look at a VCDB that grew to nearly 200GB in vCenter 6! I’m going to show you how to troubleshoot this extreme growth and bring your VCDB back to a more manageable size.

Identifying VCDB bloating:

First of all, we need to identify the problem. In this example the vCenter’s OS disk has nearly filled up and we need to find out why. WinDirStat and TreeSize are excellent utilities for showing where all that free space has gone. Just make sure you run either utility as an administrator or else you may not be getting the full picture.

WinDirStat results for VCDB

 

We’ve just identified the VCDB as the reason this vCenter’s OS disk is nearly full. This is why it’s recommended that you keep any SQL DB’s on their own volume or at least not on the OS volume.

Finding the bloat:

SQL Server Management Studio has a number of useful canned reports that we can use to identify what’s causing the VCDB to grow so large. First, let’s take a look at the database’s disk usage.

Open SQL Server Management Studio and connect to your VCDB instance. The server name will typically be VC1\VCENTER where VC1 is the name of your vCenter server.

SQL Server Management Studio Connection Settings

If your SQL Instance isn’t configured for Windows Authentication, try SQL Authentication using your vpxuser credentials that you set when you installed vCenter.

Expand your instance and the Databases folder. Right click on your VCDB and select the Disk Usage report.

SQL Server Management Disk Usage Report

This report basically confirms what we saw in WinDirStat. There are only 2.2GB of log files (VCDB.ldf) and the database is 193GB!

VCDB Disk Usage Report

Let’s take a look at the tables to see which ones are the largest. Right click your VCDB and run the Disk Usage by Top Tables report.

Disk Usage By Table

Wow!! 280,600,000 records in the events table!! Better take a look at what’s being logged. Before you decide to get fancy and query the VPX_EVENT_ARG table, just remember that according to William Lam a kitten dies every time you query the VCDB! William is a smart guy so we’d better listen to him! Seriously, there’s no reason to query the table. The events are stored in XML and there are much more friendly interfaces such as the vSphere client or PowerCLI (using the Get-VIEvent command) to get this information.

Looking at the events in the vSphere client, there appears to be an issue with one of our hosts which is causing dozens of events to be logged every second.

Client Events

Thankfully, this issue was resolved with a simple host reboot.

Looking back at our disk usage by table report, we see a lot of statistical information being stored as well. Let’s take a look at our logging levels…

Statistics Logging Level 3

This vCenter is configured for Level 3 statistics logging. Look at the estimated space required!

The Statistics Level Is Too Damn High!

Statistics Logging Level 1

That’s more like it! Level 1 is the default statistics logging level and requires much less space.

Cleaning up the VCDB:

We have two options now that the offending host has been dealt with… We could leave the VCDB alone and let it purge itself over time, or we could clean it up. Since you’re not reading this to learn to do nothing, let’s clean it up!

WARNING! Before performing ANY maintenance on the VCDB, take a FULL BACKUP of the database!!

Cleaning up the events table:

Configure vCenter’s task and event retention to one day. You can either configure this in the vSphere client…

1 Day Retention

Or follow the instructions in KB1025914 to edit the event.maxAge and task.maxAge values.

Note: If you haven’t already stopped the vCenter Server services, do so now.

Event Max Age

From here, the easiest thing to do is to wait for a day and let the VCDB purge the events on its own. However, we’ll proceed to follow the steps in the aforementioned KB to clean things up now.

In your SQL Server Management Studio console, navigate to Databases -> VCDB -> Programmability -> Stored Procedures. Right click on dbo.cleanup_events_tasks_proc and click Execute Stored Procedure.

Clean up events and tasks procedure

Hold tight! This may take some time to complete. It took an hour to run for this example. This would be a great time to go grab a coffee!

After the stored procedure has been executed, let’s see what it did by running our reports again.

SQL Server Management Studio Disk Usage After

Not too shabby. We freed up about 50GB or 26% of the overall DB.

SQL Server Management Studio Disk Usage By Table After

We also only cleaned up 77 million event records. That’s a lot but we still have 204 million records remaining. Don’t fret. Remember, we only cleaned out all but one day’s records. Since we just corrected our issue it will take another 24 hours to purge the remaining data.

Cleaning up the historical statistics:

This is optional as it will delete your historical statistics. Unlike the tasks and events, statistical data will take longer to purge on its own because of the retention rules but we’ll speed things up here. KB1007453 tells us how to clean up the stats tables, however it’s a bit cumbersome so I’ll show you how to put SQL to work for you!

We need to truncate the statistics tables. This was simpler back in the 5.0 days, but in 5.1 VMware added hundreds of tables that need to be truncated, depending on how much data is being stored. The KB implies each table should be truncated one by one, but there’s an easier way.

In SQL Server Management Studio, right click VCDB and select New Query. Paste the following into the query and hit Execute (F5).

This query will truncate all the VPX_HIST_STAT tables, which in this example there are 545!


DECLARE @intTbl INT
DECLARE @intHSNum INT
DECLARE @intTblqty INT
DECLARE @tablename varchar(15)
DECLARE @trunctable varchar(100)
SET @intHSNum = 1
TRUNCATE TABLE VPX_SAMPLE_TIME1
TRUNCATE TABLE VPX_SAMPLE_TIME2
TRUNCATE TABLE VPX_SAMPLE_TIME3
TRUNCATE TABLE VPX_SAMPLE_TIME4
WHILE (@intHSNum <5)
BEGIN
SET @intTbl = 1
SET @tablename = 'VPX_HIST_STAT' + CAST(@intHSNum AS VARCHAR) + '%'
SET @intTblqty = (SELECT COUNT(*) from information_schema.TABLES WHERE TABLE_NAME LIKE @tablename
WHILE (@intTbl <@intTblqtySET )
BEGIN
SET @trunctable = 'TRUNCATE TABLE VPX_HIST_STAT' +CAST(@intHSNum AS VARCHAR) + '_' + CAST(@intTbl AS VARCHAR)
PRINT 'Truncating ' + @tablename
EXEC (@trunktable)
SET @intTbl = @intTbl + 1
END
SET @intHSNum = @intHSNum + 1
END
GO

We’ve now freed up another 10GB!

SQL Server Management Studio disk usage after truncating statistics tables

The last step in the KB is to run the rollup jobs. Right click Past Day stats rollup and click Start Job at Step. Now do the same for the month, and week rollups.

SQL Server Management Studio Rollup Jobs

After running the rollup jobs, let vCenter sit overnight to purge the remaining events and tasks. We’ve freed up a lot of space already so there’s no reason to rush things at this point. Don’t forget to start the vCenter Server services! Your backup admin will thank you!

Now for the moment of truth…

Take a look at all that beautiful whitespace!!!

SQL Server Management Studio Disk Usage Final

From here you can decide whether you want to shrink the VCDB or not. Shrinking causes fragmentation, but I really don’t want to leave the VCDB holding on to 200GB if it doesn’t need it. To shrink the database, right click the VCDB, select Tasks, Shrink, Database.

Shrink VCDB

The database shrink can take some time to execute. You can shrink the database while the vCenter Server Services are running, but it’s not a bad idea to stop them if you can afford to.

Once the shrink is done, so are we. Take a look at the database size now! Wow!

VCDB size after cleanup

If you stopped the vCenter Server services, go ahead and start them again. You’ll notice now that the inventory populates much faster now! Don’t forget to set your task and event retention back to their original values.

Matt Bradford

4 Comments

  1. –Corrected the SQL script:

    DECLARE @intTbl INT
    DECLARE @intHSNum INT
    DECLARE @intTblqty INT
    DECLARE @tablename varchar(15)
    DECLARE @trunctable varchar(100)
    SET @intHSNum = 1
    TRUNCATE TABLE VPX_SAMPLE_TIME1
    TRUNCATE TABLE VPX_SAMPLE_TIME2
    TRUNCATE TABLE VPX_SAMPLE_TIME3
    TRUNCATE TABLE VPX_SAMPLE_TIME4
    WHILE (@intHSNum <5)
    BEGIN
    SET @intTbl = 1
    SET @tablename = 'VPX_HIST_STAT' + CAST(@intHSNum AS VARCHAR) + '%'
    SET @intTblqty = (SELECT COUNT(*) from information_schema.TABLES WHERE TABLE_NAME LIKE @tablename)
    WHILE (@intTbl <@intTblqty )
    BEGIN
    SET @trunctable = 'TRUNCATE TABLE VPX_HIST_STAT' +CAST(@intHSNum AS VARCHAR) + '_' + CAST(@intTbl AS VARCHAR)
    PRINT 'Truncating ' + @tablename
    EXEC (@trunctable)
    SET @intTbl = @intTbl + 1
    END
    SET @intHSNum = @intHSNum + 1
    END
    GO

  2. Thank you for this great article, it saved me lots of time. We have a 2 hosts ESXi (essential plus) environment and it uses SQL Express (under 5 hosts / 50 vm’s) and it locks up at 10GB. Offcourse it happend to me while i was on holiday, the dBase gets full.
    With this article it saved me some time and i’ve managed to decrease my dBase from 10GB to 3,7GB.

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.