Tuesday, 2 March 2010

Citrix XenApp Performance Monitoring/General "Server" Performance monitoring

Hi All,

As part of the current project I have been working on, I was asked to put together a list of performance counters which should be set up to monitor the performance of XenApp in depth. The contract I am on uses EdgeSight heavily, but I wanted to make sure that everything was covered in relation to performance. So in OpsMgr we set up the following, and I thought that this may be a useful reference for anyone wanting to monitor XenApp or a server's general performance...enjoy:

Kernel

Memory\Free System Page Table Entries

This will show the number of page table entries not currently used by the system. On a 32-bit (x86) Windows Server 2003 machine with 4 GB RAM, this counter should not drop below 5000 Free System Page Table Entries, as this would indicate that we are running low on kernel memory.

Logical Disk

% Disk Time

Gives an indication of how busy the disks are. The disk can become a bottleneck for a number of reasons:

- The server has too little physical memory so is “thrashing.” If thrashing is occurring, the pages/sec will also be high.

- A single user is running an application or process that makes extensive and rapid use of the disk. This can be investigated by running Current Process and Current User reports in the future.

- Many users are performing large amounts of disk activity. The speed of the disks may be the server’s bottleneck.

The metric % Disk Time is calculated using a number of factors and values above 100% are possible. If we see values of 100% disk time, the disk is in constant use. Values greater than 100% may indicate that the disk is too slow for the number of requests.

% Free Space

The server is running out of disk space. Several factors can cause this:

- A lack of remaining disk space after installing the operating system and applications.
- A large number of users are logged on (now or in the past) and their configuration data, settings, and files are taking up too much space.
- A rogue process or user is consuming a large amount of disk space.

Current Disk Queue Length and Average Disk Queue length

This counter shows the number of requests outstanding on the disk at the time that the performance data is collected. For this counter, lower values are better. Values above 2 per disk may indicate a bottleneck. This counter gives the queue length across all disks in use (so if the storage is presented as a LUN backed by 4 disks, a disk queue length of 8 might be OK). Bottlenecks can create a backlog that can spread beyond the current server that is accessing the disk, and result in long wait times for users.
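As a rough sketch of the arithmetic above, the snippet below divides the reported queue length by the number of disks behind the volume and compares the result with the limit of 2 per disk. The function names and the idea of passing in a disk count are illustrative assumptions, not something OpsMgr or perfmon provides.

```python
def queue_length_per_disk(queue_length: float, disk_count: int) -> float:
    """Spread the reported queue length across the disks behind the volume."""
    if disk_count < 1:
        raise ValueError("disk_count must be at least 1")
    return queue_length / disk_count


def is_disk_queue_bottleneck(queue_length: float, disk_count: int,
                             per_disk_limit: float = 2.0) -> bool:
    """True when the per-disk queue length is above the limit (2 per disk here)."""
    return queue_length_per_disk(queue_length, disk_count) > per_disk_limit


if __name__ == "__main__":
    # A queue length of 8 on a LUN with 4 disks is 2 per disk, so not flagged.
    print(is_disk_queue_bottleneck(8, 4))
    # The same queue length on a single disk is flagged.
    print(is_disk_queue_bottleneck(8, 1))
```

The disk count has to come from whoever manages the storage, since the counter itself does not know how many spindles sit behind a LUN.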

Logical Disk: Avg. Disk sec/Read and Logical Disk: Avg. Disk sec/Write

These counters show the average time, in seconds, of a read or write operation to the disk. Typically these are used to monitor SQL performance but it may be useful to see what times are being produced. From a SQL perspective the following is a rough guide to disk performance:

- Avg. Disk sec/Read - Measure of disk latency. Avg. Disk sec/Read is the average time, in seconds, of a read of data from the disk. More info:

Reads: Excellent < [...] > 20 msec (0.020 seconds)

- Avg. Disk sec/Write - Measure of disk latency. Avg. Disk sec/Write is the average time, in seconds, of a write of data to the disk.

Non-cached writes: Excellent < [...] > 20 msec (0.020 seconds)

Cached writes only: Excellent < [...] > 4 msec (0.004 seconds)
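Because the counters report seconds while the guide figures are in milliseconds, a small conversion helps when comparing them. The sketch below treats the 20 msec and 4 msec figures quoted above as upper limits, which is an assumption on my part; the dictionary keys and function names are made up for illustration.

```python
# Upper limits in milliseconds, taken from the figures quoted in the post.
# Treating them as "slower than this is a concern" is an assumption.
LATENCY_LIMITS_MS = {
    "read": 20.0,
    "write_noncached": 20.0,
    "write_cached": 4.0,
}


def seconds_to_ms(seconds: float) -> float:
    """Convert an Avg. Disk sec/Read or sec/Write sample to milliseconds."""
    return seconds * 1000.0


def exceeds_latency_limit(avg_disk_sec: float, kind: str) -> bool:
    """True when the sample is slower than the limit for that kind of I/O."""
    return seconds_to_ms(avg_disk_sec) > LATENCY_LIMITS_MS[kind]


if __name__ == "__main__":
    print(exceeds_latency_limit(0.025, "read"))          # 25 ms read
    print(exceeds_latency_limit(0.015, "read"))          # 15 ms read
    print(exceeds_latency_limit(0.005, "write_cached"))  # 5 ms cached write
```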

Memory

Available Bytes

Informs you if too much memory is being used. This could be because:
- Too many users are logged on.
- The applications that users are running are too memory-hungry for the amount of memory available on the server.
- Some user or process is using a large amount of memory. Running a perfmon report focusing on the Current Process report may help you track this down.

Being short on memory could result in “thrashing.”

Pages/sec (Hard Page Faults)

A large amount of paging indicates either:
- The system is low on physical memory and the disk is being used extensively as virtual memory. This can be caused by too many users being logged on, too many processes running, or a rogue process “stealing” virtual memory.
- An active process or processes are making large and frequent memory accesses.

Too much paging degrades the performance of the server for all users logged on. The Available Bytes, Disk, and % Processor Time metrics may also enter warning or critical states when a large amount of paging occurs. Short bursts of heavy paging are normal, but long periods of heavy paging seriously affect server performance.

It is generally agreed that anything over 20 Pages/sec could be deemed as a bottleneck.
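To separate the normal short bursts from the long periods that actually hurt, one option is to alert only when several consecutive samples are over the threshold. This is a minimal sketch of that idea; the 20 Pages/sec figure is from above, while the function name and the number of consecutive samples are assumptions you would tune to your own sampling interval.

```python
from typing import Iterable


def has_sustained_paging(samples: Iterable[float],
                         threshold: float = 20.0,
                         min_consecutive: int = 5) -> bool:
    """True when at least min_consecutive samples in a row exceed threshold."""
    run = 0
    for value in samples:
        if value > threshold:
            run += 1
            if run >= min_consecutive:
                return True
        else:
            # A sample at or below the threshold ends the current run.
            run = 0
    return False


if __name__ == "__main__":
    burst = [5, 80, 90, 4, 3, 2]         # short spike, then quiet again
    sustained = [25, 30, 40, 22, 21, 35]  # over the threshold throughout
    print(has_sustained_paging(burst, min_consecutive=3))
    print(has_sustained_paging(sustained, min_consecutive=3))
```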

Page File: % Usage

While the Page File is less likely to be a bottleneck it is worth checking. Anything over 70% could indicate an issue.

Memory: Pool Non-paged Bytes

A good counter to detect an application memory leak. When the application is closed, memory should be returned. If this counter begins to creep up at a rapid rate during the day then a faulty application may be causing a memory leak.
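One way to put a number on "creeping up" is to fit a straight line through the day's samples and look at the slope. The sketch below does an ordinary least-squares fit by hand; the function name and the (hours, bytes) sample format are assumptions for illustration, and what counts as a "rapid" growth rate is left for you to decide.

```python
from typing import Sequence, Tuple


def growth_per_hour(samples: Sequence[Tuple[float, float]]) -> float:
    """Least-squares slope of (hours_since_start, bytes) samples, in bytes/hour."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(t for t, _ in samples) / n
    mean_b = sum(b for _, b in samples) / n
    numerator = sum((t - mean_t) * (b - mean_b) for t, b in samples)
    denominator = sum((t - mean_t) ** 2 for t, _ in samples)
    if denominator == 0:
        raise ValueError("samples must cover more than one point in time")
    return numerator / denominator


if __name__ == "__main__":
    # Hypothetical Pool Non-paged Bytes readings taken once an hour.
    readings = [(0, 100_000_000), (1, 110_000_000),
                (2, 120_000_000), (3, 130_000_000)]
    print(growth_per_hour(readings))  # bytes gained per hour
```

A slope that stays positive day after day, without dropping back when applications are closed, is the pattern worth investigating.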

Committed Bytes

If committed bytes are higher than the amount of physical RAM then this would indicate a server with not enough memory.

Network

Bytes Total/Sec

This metric gives a good indication of how much network activity this server is generating or receiving. Thresholds here are dependent on a number of variables such as network link, NIC hardware etc.

Processor

% Interrupt Time

The processor is spending a large amount of time responding to input and output rather than user processing. A large value for interrupt time usually indicates a hardware problem or a very busy server.

% Processor Time

A high processor time for a long period of time indicates that the processor is the bottleneck of the server, too many users are logged on, or there is a rogue user or process (use the Current Process perfmon counters to investigate). This is pure processor utilisation, not always a bad thing when it's 80-90%. User experience and helpdesk calls would need to correlate to this for this to be deemed the issue.

System

Context Switches/Sec

A large number of threads and/or processes are competing for processor time. There are a number of variables which could be used to determine thresholds, such as the number of processors and clock rate. Thresholds I have seen mentioned in the past are around 15000 per CPU.

Terminal Services

Active Sessions

A large number of users are logged on and running applications. The server may begin running out of memory or processor time and performance for users may deteriorate.

Inactive Sessions

A large number of disconnected sessions are taking virtual memory. Remove some disconnected sessions or reduce the length of time for which disconnected sessions can persist until they are automatically removed.

Citrix Metaframe Presentation Server

Data Store Connection Failure

Thresholds here are dependent on WAN/LAN etc but this value should be low.

Tuesday, 21 July 2009

Operations Manager 2007 R2 ROI

Hi All,

Recently I was asked to help a customer write a high level Return on Investment summary for the implementation of Operations Manager 2007 R2. I had a look on the web and found many points for Config Mgr (which are useful) but not too much on Operations Manager 2007 R2 from a high level business perspective. So here is my take on why an organisation should look at Operations Manager 2007 and what the possible Return on Investment could be:

Operations Manager 2007 R2 Return on Investment

Return on Investment on system/service monitoring solutions can be based on tangible, usually financial benefits, or intangible, soft focused benefits such as increased infrastructure, application and service monitoring knowledge within an organisation's support team. Usually tangible benefits in implementing Operations Manager 2007 R2 focus on cost savings in the following areas:

  • License Cost – Operations Manager 2007 R2 with Microsoft’s flexible licensing suites, including SMSE (Server Management Suite Enterprise) and sector focused pricing, means that implementing any product in the System Center family can be extremely cost effective, as licensing is based on the host processor and not on each individual OS to be deployed. As System Center is usually part of an organisation's existing volume license and/or Software Assurance agreement, considerable cost savings can be made when compared to other enterprise management suites such as IBM Tivoli TEC and HP OpenView.
  • License Savings – Operations Manager 2007 R2 is capable of being an end to end service monitoring platform, meaning that consolidation of current point monitoring tools is probable, resulting in possible license savings. It also means that core service monitoring can be centralised.
  • Service Outages – Failure of a core customer focused service can incur costs when specialist knowledge from a third party is required to fix the issue. Further costs can be incurred by the organisation if the outage is sustained. Operations Manager 2007 R2 provides proactive monitoring, meaning that potential issues can be captured and addressed before they cause a major service failure. Each Management Pack (set of rules) which can be deployed with Operations Manager includes expert knowledge directing a support operative to the potential cause and resolution of an issue.
  • Manual service checks – In many developing IT infrastructures a support team member's role may be to do daily, weekly or monthly checks on core services. Operations Manager 2007 R2 allows for automation of these checks and remedial recovery if there is an issue.


The intangible benefits to an organisation implementing Operations Manager 2007 R2 can be numerous and include:

  • Knowledge based support – All Operations Manager 2007 R2 Management Packs have expert knowledge from product vendors built in, meaning that if an application issue were to occur, a clear summary of the problem, cause and possible resolution is presented to the operator, empowering them to solve the issue in a guided manner.
  • Time To Resolve (TTR) – Time To Resolve refers to the time from when an issue occurs to when it is solved and the service can continue, in line with the service agreement, to the end customer or end user. As Operations Manager 2007 R2 provides knowledge based support, the Time To Resolve for issues can be greatly reduced.
  • Simplifying complexity – As an organisation's infrastructure grows to support its processes, it can become a complex mesh of systems, which can make it difficult for a non IT professional to see the benefit of the infrastructure deployed in relation to the business. Operations Manager 2007 R2 simplifies this complexity with the ability to create Line of Business (LOB) views which clearly show the relationship of various system components to a business process.
  • Real Time Service Level Objective (SLO) Monitoring – Operations Manager 2007 R2 allows the operator to easily link specific, percentage based service level objectives to core Line of Business services or to specific application components such as databases. This makes it easier to see the benefit of IT systems to the organisation and can allow for information to be used in Business Process Improvement (BPI) or Business Process Restructuring (BPR) initiatives.

Hope these help. There are many more, especially as there are many great features of Operations Manager 2007 R2, but these are quite a good high level summary of the potential return from an Operations Manager 2007 R2 deployment.

Thanks,

John

Wednesday, 27 May 2009

Operations Manager 2007 R2 180 day eval available for download

Hey All,
Just to let you know that as of last Thursday, 21st May 2009, the Operations Manager 2007 R2 180 day evaluation is available for download from http://www.microsoft.com/downloads/details.aspx?FamilyID=93ddf25b-1ef0-4851-81b0-5fb9a2f76181&DisplayLang=en . At the time of writing there is no full version download available from MSDN subscriber downloads.

So what is all the fuss about? Well the new version (which requires new licensing unless you have Server Management Suite Enterprise Licensing with Software Assurance) has the following enhancements:

- Improved stability and performance primarily for the Operations Manager 2007 console.
- Native cross platform monitoring of Unix and Linux.
- Improved notification configuration.
- Process monitoring.
- Wild card service monitoring.
- Integrated service level monitoring and reporting.
- The console is now black.
- Power consumption for each computer or for a group of computers with the power consumption management pack.
- Service Level Dashboard v2.0 which integrates with Sharepoint Services or MOSS 2007.

There are many more enhancements and over the next couple of weeks I will be writing a review. Documentation can be found at http://technet.microsoft.com/en-us/opsmgr/bb498235.aspx . Also some new marketing datasheets - http://www.microsoft.com/systemcenter/operationsmanager/en/us/datasheets.aspx .

Thanks,

Momski

Tuesday, 12 May 2009

OpsMgr Health Service fails to start after applying new QFE 957123?

This week I am doing an OpsMgr 2007 deployment with ACS including Virtual Machine Manager 2008, SecureVantages packs and hopefully BridgeWays Oracle Pack.

Picture the scene.... the customer, OpsMgr console looking great with me evangelising how good it is, radio on (relaxed atmosphere) kicking out some great 80s music (Prince – Little Red Corvette if you must know) and I am as happy as can be as the mighty Blackburn Rovers have reached the magic 40 points so no relegation worry......I know, I know you all want my job....I would not swop it though for a million dollars*

And then it’s that time – time to import the IIS Management Pack. Well I am sure those of you who read the forums or have gone through the pain of trying to monitor IIS 7 using the new IIS Management Pack are aware of the “colourful” issues presented here – especially with the original release of QFE 957123. Config and SDK service failing to start, anyone????? Now this was captured early by the IIS Management Pack team and a new version released:

http://www.microsoft.com/downloads/details.aspx?FamilyID=f82ef6b3-a08e-4295-b9e3-fb78a74aefa8&displaylang=en

So I thought no issues – great! Oh OpsMgr how you surprise me! I applied the hotfix to the RMS and guess what, the OpsMgr Health Service failed to start – great – and presented me with this nice error:

"The OpsMgr Health Service service terminated with service-specific error 3 (0x3)."

Customer - “That will be an issue then?” Me – “Yes, don’t worry (%$#)”

So I decided to have a look at http://blogs.technet.com/smsandmom/archive/2008/04/30/opsmgr-2007-healthservice-service-fails-to-start-with-25362-warning.aspx

Sure enough this put me on the right track...so to cut a long story short, to fix:
(1) Open Registry Editor
(2) Navigate to HKLM\SYSTEM\CurrentControlSet\Services\HealthService\Parameters
(3) In the State Directory (REG_SZ) value, make sure that there is an entry pointing to:

"drive"\Program Files\System Center Operations Manager 2007\Health Service State

Where drive is your install drive.
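If you would rather check the value from a script than click through Registry Editor, something along these lines could work. It is only a sketch: the registry path and value name are the ones given above, the folder name is written as "Health Service State" on the assumption of a default install, and the helper names are made up. The registry read uses Python's standard winreg module, so that part only runs on Windows.

```python
import ntpath

KEY_PATH = r"SYSTEM\CurrentControlSet\Services\HealthService\Parameters"
VALUE_NAME = "State Directory"
# Assumed default install folder below the drive root.
INSTALL_SUBPATH = (r"Program Files\System Center Operations Manager 2007"
                   r"\Health Service State")


def expected_state_directory(drive: str) -> str:
    """Build the expected path for an install drive such as 'C', 'C:' or 'C:\\'."""
    root = drive.rstrip(":\\") + ":\\"
    return ntpath.join(root, INSTALL_SUBPATH)


def state_directory_matches(value: str, drive: str) -> bool:
    """Compare a registry value with the expected path, ignoring case and slashes."""
    def normalise(path: str) -> str:
        return ntpath.normcase(ntpath.normpath(path))
    return normalise(value) == normalise(expected_state_directory(drive))


def read_state_directory() -> str:
    """Read the State Directory value from HKLM (Windows only)."""
    import winreg  # imported here so the helpers above work on any platform
    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, KEY_PATH) as key:
        value, _value_type = winreg.QueryValueEx(key, VALUE_NAME)
    return value


if __name__ == "__main__":
    print(expected_state_directory("C"))
```

On the RMS itself you could call state_directory_matches(read_state_directory(), "C") and only open Registry Editor if it comes back False.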

I hope that helps people out there!

Thanks,

Momski
http://www.mmsnews.info

* Disclaimer: If you do wish to offer Momski a million dollars then this site does not take any liability in him saying yes as he is not crazy!

Sunday, 26 April 2009

Momski at MMS 2009

Hey All,

Firstly....I might as well get this over with. I have just slapped my hands as it has been ages since posting, but my job role has changed in recent months and I am now Head of Service Management. Finding time to blog has been really hard - sorry. This week I am at MMS 2009 and I have set up the following social group to keep track of events as they happen: http://www.mmsnews.info - sign up and contribute. I will be blogging on there during the week and will be putting up video interviews with some of the major vendors and getting their feedback and analysis. Also make sure you start following http://twitter.com/mmsnews for the latest gossip, news and updates as they happen.

Regards,

John

Thursday, 20 November 2008

System Center Operations Manager 2007 R2 beta

The System Center Operations Manager 2007 R2 beta is now available to download.

http://technet.microsoft.com/en-us/opsmgr/dd239186.aspx

For those of you at TechEd this year, you will have heard the announcement that instead of an SP2 next year there will be an R2 release, basically because the number of enhancements and features warrants more than just a rollup service pack.

First question is what does R2 mean in terms of licensing? i.e. can I upgrade from OpsMgr SP1 to R2? The answer is technically yes, but it is my understanding that this will not be a free upgrade unless, of course, you have purchased SA (Software Assurance), as an R2 release is viewed as a new version of the product.

So hopefully in the next couple of weeks I am going to try and set aside some time to do a test install and post back my findings but highlights of this release will be:

Cross Platform Monitoring

Extends Operations Manager to perform detailed monitoring of non-Microsoft operating systems, and workloads running on those systems.
  • Integrated experience for discovery of systems to be monitored; whether Windows, Unix, Linux or other.
  • Addition of Unix/Linux servers into the device management node of the administration pane.

Service Level Tracking

  • Presents a detailed report on performance and availability metrics of IT service levels for all monitored IT services.
  • Allows granular definition of how an IT service will be monitored; allowing definition of service level objectives that can be targeted against different objects that are to be monitored.
  • Delivers the capability to report and surface information through dashboards, such as Microsoft SharePoint.

New Monitoring Templates

Monitoring templates aid in the creation of monitoring capabilities by providing a predefined set of monitors.

  • New process monitoring template, enabling you to monitor for a minimum number of processes running on a system, set a threshold to identify when the number of processes exceeds a defined number, and define policy that can terminate an undesired process should it be identified on a monitored system.
  • Enhanced OLEDB template, allowing operators to identify the database and set thresholds for connection, query and fetch times, as well as run custom queries against a database.
  • Improved Windows Service template, enabling wildcard entry for selection of multiple, similarly-named services.
  • New templates for monitoring the existence of log files on Unix and Linux systems.

Enhanced Usability

  • New integrated import wizard for management packs (MPs) allows you to browse the MPs available on the MP catalog, and automatically download and install those MPs that you select.
  • Enhanced notification subscription wizard, simplifying the creation and maintenance of notifications. New features include allowing operators to create notifications directly from the alert view, and to more easily create targeted and condition-based notifications.
  • New overrides summary view, allowing you to view all overrides across both sealed and unsealed management packs.
  • New ability to view the health explorer via the web console.
  • One-click maintenance mode, ensuring that all necessary and related monitors suspend their respective monitoring at the same time.

Improved Performance

  • Enhanced console delivers significantly improved response times compared to earlier versions of the product.
  • Improved scalability of URL monitoring, delivering monitoring of over 1000 URLs per management server.
  • Adds full support to install on Microsoft SQL Server 2008.

It's beginning to feel a lot like..........

.....the busiest time of the year!

Wow....time flies when you are really busy. Looks like I have been neglecting my blog! November seems to have just flown by! In the last month I have been deploying Operations Manager 2007 on Windows 2008 and also deploying VMM to work in conjunction with it. So far things are going OK but I have run into issues with the DHCP and DNS management packs. Basically it seems that the monitoring/discovery scripts keep launching on the agent server, causing memory to be eaten by the cscript process like it's going out of fashion! Anyhow, Microsoft are aware of these issues and last week a new DHCP pack was released, which seems to have done the trick....just the DNS one now please, Microsoft.

Yesterday Dilgenter had their annual Symposium event at the Science Museum in London. I did a keynote presentation in the afternoon on the System Center Roadmap and overall it was a real success, with over 40 people attending. I was really encouraged to see the interest in the System Center family, and with the success of the event it hopefully means more projects, which is great. So I am planning to follow up initially with a series of webinars over the next couple of months on OpsMgr, VMM 2008, OpsMgr 2007 R2 and ConfigMgr....oh yeah and maybe fit some Christmas shopping in at some point!!!