These days, shrinking IT budgets and pressure to maintain the infrastructure with fewer resources seem to be the norm. If you have SCOM in your environment, you can save time and resources by using simple problem resolution tasks or by building more advanced methods to automate the fix for recurring infrastructure maladies. In Part 1 of this series, we describe the approach behind implementing a time-saving process that preemptively resolves a known server issue before users are impacted. The remainder of the solution will be described in more detail in next month’s newsletter.
Regardless of infrastructure size or complexity, most IT professionals see immediate benefits from automation through scripting. Some companies attempt to self-heal issues as they arise; however, this could mask underlying problems within the environment. The subtleties of any issue being considered for automation should be thoroughly analyzed before a self-healing task is implemented. Putting it all together typically requires up-front planning and must fall within your company’s policies.
Task Process Envisioning
Before automation can be developed, we must have a complete understanding of the problem space and a clear idea of the desired outcome. In this scenario, our fictitious company has a LOB application that periodically stops working when the hard drive runs out of space. From previous experience with this issue, we know the IIS logs are the root cause of the disk space exhaustion. The old IIS logs can be safely deleted; however, they must be retained for a minimum of five days to satisfy corporate policies. This is enough information to begin the preliminary envisioning and readiness checks.
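The retention rule above can be sketched in a few lines. This is only an illustration of the logic, written in Python for brevity, and not the PowerShell script this series builds. The function names, the `*.log` file pattern, and the dry-run default are our own assumptions.

```python
import time
from pathlib import Path

RETENTION_DAYS = 5  # minimum retention required by the scenario's corporate policy
SECONDS_PER_DAY = 86400


def find_expired_logs(log_dir, retention_days=RETENTION_DAYS, now=None):
    """Return *.log files under log_dir last modified more than retention_days ago."""
    now = time.time() if now is None else now
    cutoff = now - retention_days * SECONDS_PER_DAY
    return sorted(
        path
        for path in Path(log_dir).rglob("*.log")  # assumption: logs end in .log
        if path.is_file() and path.stat().st_mtime < cutoff
    )


def delete_expired_logs(log_dir, retention_days=RETENTION_DAYS, dry_run=True):
    """Delete expired logs; with dry_run=True, only report what would be removed."""
    expired = find_expired_logs(log_dir, retention_days)
    if not dry_run:
        for path in expired:
            path.unlink()
    return expired
```

Defaulting to a dry run lets you review the list of candidate files before anything is deleted. The log directory is passed in rather than hard-coded, since the location can differ from server to server.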
Since automation can be achieved in any number of ways, it’s important to consider which scripting language is best for your organization. We believe the best scripting language is the one your IT pros are most proficient in. Since we are using SCOM to initiate the script, we need to ensure the environment can support our idea. In addition to selecting a scripting language, we also need to organize our information about the application and server components. Here’s what we know:
- We are using SCOM R2, which natively supports VBScript and PowerShell
- The LOB application servers run Windows Server 2008 R2 with IIS 7
- We have chosen to use a PowerShell script with the IIS 7 snap-in
- We don’t know if the servers have PowerShell configured correctly
Everything looks good so far; however, we first need to validate the PowerShell configuration on our application servers. With a few minutes of preparation, we can use a process resembling the illustration below to ascertain whether the servers are capable of supporting the IIS log cleanup script.
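As a rough sketch of such a readiness check, the decision logic might look like the following. It is written in Python purely for illustration. The `ServerFacts` fields, the specific checks, and the set of acceptable execution policies are all our assumptions, and how the facts are gathered from each server is left out.

```python
from dataclasses import dataclass


@dataclass
class ServerFacts:
    """Facts gathered for one server; every field name here is hypothetical."""
    name: str
    powershell_installed: bool
    execution_policy: str       # e.g. "Restricted", "AllSigned", "RemoteSigned"
    iis_snapin_available: bool


# Assumption: these policies let an unsigned, locally stored script run.
SCRIPT_FRIENDLY_POLICIES = {"RemoteSigned", "Unrestricted", "Bypass"}


def readiness_problems(facts):
    """Return a list of reasons the server cannot yet run the cleanup script."""
    problems = []
    if not facts.powershell_installed:
        problems.append("PowerShell is not installed")
    elif facts.execution_policy not in SCRIPT_FRIENDLY_POLICIES:
        problems.append(
            f"execution policy '{facts.execution_policy}' may block unsigned scripts"
        )
    if not facts.iis_snapin_available:
        problems.append("IIS snap-in is not available")
    return problems
```

A server is considered ready when the returned list is empty. Otherwise, the list doubles as a to-do list for bringing that server up to the required configuration.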
Other Task Design Considerations
Now that we have determined the scripting language and verified that the servers have all the necessary prerequisites, a few more questions must still be resolved. During the design phase, ask yourself these general questions:
- Do my tasks involve running a script or command?
- What scripting engines do I have available on the servers?
- Are my commands available on all servers (such as “net start”)?
- When should the automation run (on a schedule, triggered on failure, etc.)?
- Where should the automation run from (agent or console)?
- Does my script have any dependencies or version requirements?
Make a mental note of the operating system and related application versions as this could impact the way in which you design your process. Authoring a script that must work across operating systems and server components can add complexity.
Running the Task Manually, as a Recovery, or as a Scheduled Rule?
Based on the scenario outlined above, we know the problem happens about once per week and can manifest at any time. Although we could implement a recovery script that runs when low disk space is detected, it would be easier to run the IIS log cleanup script once per day as a SCOM rule.
Stay tuned for Part 2 of the series to find out how we automated the IIS log cleanup using PowerShell and a configurable SCOM rule.