Friday, November 13, 2015

Data Warehouse: Don’t Forget To Modify The Retention

Case
Even though quite a few postings have been written about it, still many people don’t realize the full impact of the default retention settings of the SCOM/OM12x Data Warehouse database. Okay, they know the data is kept for 400 days before it’s groomed out. But many times this is where it stops. And that’s too bad. Why? Just keep on reading Smile.

No need to repeat others…
Since Kevin Holman has written an excellent posting about the Data Warehouse retention and grooming, there is no need to repost/rewrite it here. And yes, the posting is written for SCOM 2007x but applies to OM12x as well.

For me the 3 key takeaways of that posting are:

  1. The Data Warehouse database will be one of the LARGEST databases supported by a company;
  2. You should give STRONG consideration to reducing your warehouse retention to your reporting REQUIREMENTS;
  3. If you don't have any – MAKE SOME!

I would like to follow up on the third item but I don’t know what dataset to ‘touch’…
Since many people/companies don’t follow up on those advises, the Data Warehouse becomes very big. As a result, backups/restores of the same database becomes a challenge.

So what to do? Easy. Modify the retention settings in the Data Warehouse database so it adheres to your company policy and compliancy requirements.

But as it turns out, all of the sudden some road blocks do pop up. Because people are having a difficult time by deciding what kind of data should be kept for 400 days or not.

Some suggestions of what kind of datasets to modify
Suppose your company has the requirement to run reports which show a maximum of ONE year of data for performance and availability.

I’ve seen many cases like these where the SCOM admins say: Okay, based on those requirements there is no need to modify the retention settings since the difference between 365 days (a year) and 400 is way too small to go through the hassle.

So far so good. BUT there is one HUGE downside. YES, your company wants to run reports back to a maximum of ONE year. So you must keep that data there for at least ONE year. BUT does your company require reports dating a year back containing HOURLY data?

Because that’s the default! HOURLY data (performance & availability) are kept for 400 days. And it goes without saying those very same datasets get VERY big.

Until now I haven’t met one company with that requirement. Yes, they require reports going one year back. But in most cases those reports are allowed to use DAILY aggregations, not HOURLY.

So when you’re going to ask your company what kind of policy they adhere to when it comes to data retention, make sure they not only say for how many days, but also to what kind of data (daily or hourly) it applies to.

And until now, this is what most of the companies I’ve met require:

  • DAILY data: 1 year (leave it to the default of 400 days);
  • HOURLY data: 100 days (a difference of 300 days with the default setting of 400 days!)

And now you’ve got a WHOLE different kind of story. And while you’re at it, don’t forget the Alerting dataset as well. Since here the data is kept for 400 days as well. And most of the times the half or even less is good enough.

Back to Kevin it is
With that information read Kevin’s posting about how to modify the retention settings on the related data sets and your Data Warehouse database will shrink considerably in size!

Teaser
In another blog posting I’ll share some tips and queries about how to troubleshoot data aggregation issues in the Data Warehouse database.

No comments: