TCP/IP KeepAlive, Session Timeout, RPC Timeout, Exchange, Outlook and you

Update June 21th, 2016 following feedback and a (true golden) blog post by the Exchange Team – Checklist for troubleshooting Outlook connectivity in Exchange 2013 and 2016 (on-premises) I’ve updated the recommended values for the timeout settings, and shortened the article overall for better reading. Do read the post in general, and in topic – check the CAS & Load Balancer configuration paragraphs.

Hi Again,

This post will spotlight networking considerations that are mostly overlooked. I’ve gathered a few of these issues that might brought you here searching for an answer:

  • Outlook is retrieving data from the Microsoft Exchange Server
  • The connection to Microsoft Exchange is unavailable. Outlook must be online or connected to complete this action
  • Sent items are stuck in Outbox or delayed
  • Outlook freezes or stuck when sending a message
  • Event ID 3033 regarding Exchange Server ActiveSync complaining about the most recent heartbeat intervals used by clients
  • Other strange / weird issues “but PING works! / telnet to the port works great!” – my personal favorite

The mentioned issues or symptoms could take place in any network environment, thus more common in complex network setups where multiple devices are protecting / route network traffic. Some typical configurations examples could be one of the following:

  • Outlook Anywhere or RPC over HTTP is being used, servers are protected or published by ISA / TMG / UAG / F5 / Juniper or any other reverse proxy / publishing solutions
  • Exchange servers are located behind a firewall, router or other network device
  • Clients / Remote clients are located behind a firewall, router or other network device (just to be clear on that…)
  • Exchange servers are being load-balanced with an external physical / virtual appliance

If you’ve read this post up until here and got disappointed because the above does not fit your issue, I’d like to suggest reviewing other RPC troubleshooting topics that might help Troubleshooting Outlook RPC dialog boxes – revisited or Outlook RPC Dialog Box Troubleshooting

Exchange Server traditionally (2000 to 2010) used MAPI over RPC to communicate “natively”, RPC is known to be “sensitive” and that’s why Exchange Server 2013 and beyond allows only Outlook Anywhere (RPC over HTTP) connections from clients which in my opinion is a great change that will simplify future deployments.

Client<>Server connections in general remains active while data “flows” , mails are sent/received etc. but when the connection is Idle, we might have a situation that it will be terminated. Here comes the term KeepAlive – a “dummy” packet that makes sure the connection remain active while no data is flowing and idle.

Here’s my “how-to” suggestion:

  • Configure the RPC timeout on Exchange servers to make sure that components which use RPC will trigger a keep alive signal within the time frame you would expect
    reg add "HKLM\Software\Policies\Microsoft\Windows NT\RPC" -v "MinimumConnectionTimeout" -t REG_DWORD -d 120
  • Consider modifying the server TCP/IP KeepAlive to reduce the chance of “IDLE” connections being terminated – (Default is Two hours – The recommended value is 30 minutes , and no less then 15 minutes) – this controls the OS TCP behavior with idle connections, could greatly improve responsiveness and scalability –
  • Make sure that you are aware of any router, firewall or any other network device that is placed between your clients and your servers. Once you do – note their session timeout, session TTL or session ageing setting for the relevant protocol and port! (this could be tricky, so do not treat this lightly)

The trick for success here is that timeout settings should be configured without overlapping one another while following the client access “path” – for example – Client > FW > Load Balancer > Server:

  • FW timeout TCP/IP timeout – 40 minutes
  • Load Balancer – TCP/IP timeout – 35 minutes
  • Server – TCP/IP timeout – 30 minutes

If additional network devices are placed between the server and your clients, make sure that session timeout settings continue to be configured accordingly.
With today’s security measures, network security has become much more complex. A typical corporate network will implement many different network appliances or software based solutions to secure data, restrict access, prevent attacks and unwanted traffic.
Bottom line – don’t think you are done with network considerations just because “ping works” or an email comes with a statement like “your port is now open”.

I hope this post will benefit others as this issue was and will probably remain common with Exchange and other client / server services.

Don’t get timed out 🙂

Additional useful links and sources of data:

Where did “newsid” go ? seems mark got the answer

Mark posted a great post about this ancient urban legend, “The Machine SID Duplication Myth”

For the record, just remember that NewSID was never the solution for imaging a computer as a template.
And i’ll quote some of mark’s post and leave you to read the rest on his blog..

On November 3 2009, Sysinternals retired NewSID, a utility that changes a computers machine Security Identifier (machine SID). I wrote NewSID in 1997 (its original name was NTSID) because the only tool available at the time for changing machine SIDs was the Microsoft Sysprep tool, and Sysprep doesn’t support changing the SIDs of computers that have applications installed. A machine SID is a unique identifier generated by Windows Setup that Windows uses as the basis for the SIDs for administrator-defined local accounts and groups. After a user logs on to a system, they are represented by their account and group SIDs with respect to object authorization (permissions checks). If two machines have the same machine SID, then accounts or groups on those systems might have the same SID. It’s therefore obvious that having multiple computers with the same machine SID on a network poses a security risk, right? At least that’s been the conventional wisdom.

….I realize that the news that it’s okay to have duplicate machine SIDs comes as a surprise to many, especially since changing SIDs on imaged systems has been a fundamental principle of image deployment since Windows NT’s inception. This blog post debunks the myth with facts by first describing the machine SID, explaining how Windows uses SIDs, and then showing that – with one exception – Windows never exposes a machine SID outside its computer, proving that it’s okay to have systems with the same machine SID. Note that Sysprep resets other machine-specific state that, if duplicated, can cause problems for certain applications like Windows Server Update Services (WSUS), so MIcrosoft’s support policy will still require cloned systems to be made unique with Sysprep.

Enjoy !

Thanks mark for the clarification.

Configure Session TTL / Timeout in Fortinet

Hey there Mobile admins..

Recently, I’ve did some troubleshooting with Fortinet and ActiveSync timeout, also known as Event ID 3030 Source: Server ActiveSync with the following being output to the Application Log on an Exchange Server 2003 and 2007.

Event Type: Warning
Event Source: Server ActiveSync
Event Category: None
Event ID: 3033
The average of the most recent [200] heartbeat intervals used by clients is less than or equal to [9]. Make sure that your firewall configuration is set to work correctly with Exchange ActiveSync and direct push technology. Specifically, make sure that your firewall is configured so that requests to Exchange ActiveSync do not expire before they have the opportunity to be processed.

Read more on the Direct Push in Technet : Understanding Direct Push , typically you will need to adjust your session TTL to no less then 12 minutes.

Fortinet  lists the official help on the subject in – FD31862 – Customizing Session TTL in FortiOS 4.0 , FortiOS 4 also allows this in Per rule ! so for all those with FortiOS 3 , use the mentioned KB from Fortinet try the FortiOS CLI Reference..

Usually i set this time out to no less the 15 minutes or 900 seconds.. you’r call 🙂

-updated the link to Fortinet KB