|
I came across an issue with a client of ours yesterday. They are set up with four Windows Server 2008 boxes: one runs as the database server for our software and also hosts the TS license manager and the TS session broker, and the other three run as terminal servers. All of a sudden yesterday, all four servers somehow lost their trust relationship with the client's domain controller (a different server). I had to fix the issue by removing all four servers from the domain and then rejoining them. I have had this happen before on laptops that were not connected to the network for several weeks at a time, but never with servers. What typically causes this? The only abnormality I can think of is that the four servers were on their own Cisco switch/router, on a different subnet than any other equipment. Perhaps the Cisco device went down briefly, causing the loss of sync?
|
| # ? Nov 03, 2009 18:42 |
|
Were the system event logs complaining about communication problems with the DC(s) before everything crapped out?
|
| # ? Nov 03, 2009 18:43 |
|
BangersInMyKnickers posted: Were the system event logs complaining about communication problems with the DC(s) before everything crapped out?
LOL, that's the part I forgot to mention. The client is in a very remote location several states away, and consequently their internet access goes down often (even though they have a business-grade T1). I was unable to look at any of the logs, because I had to talk one of their IT guys through all of this over the phone. He should be calling me back later, once the internet access comes back up, so I can go over the logs.
|
| # ? Nov 03, 2009 18:50 |
|
BangersInMyKnickers posted: Were the system event logs complaining about communication problems with the DC(s) before everything crapped out?
This would be my guess as well. Isn't the default Kerberos token valid for like 28 days though?
|
| # ? Nov 03, 2009 18:54 |
|
I can't remember the exact interval, but computer accounts in AD automatically update their password against the DC. There is a grace period after which the old password goes stale, and if a machine exceeds it you can get exactly what you are describing, where the machines just drop off the domain. Check those logs when you can, but if that is your situation then you may be better off placing a DC at that location, setting up a second site in AD, and configuring the replication window between that site and your primary. That would leave something local to authenticate against in the (apparently likely) event of network interruptions.
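To make the password-age idea concrete, here is a rough Python sketch. It assumes the Windows default of a 30-day maximum machine account password age (a site can change this through policy) and uses made-up dates; in practice the last-set time would come from something like the computer object's pwdLastSet attribute. An overdue result only suggests the machine has not managed to rotate its password recently. It does not by itself prove the secure channel is broken.

```python
from datetime import datetime, timedelta

# Assumption: 30 days is the Windows default maximum machine account
# password age; the client's domain may be configured differently.
DEFAULT_MAX_AGE = timedelta(days=30)


def password_age(last_set: datetime, now: datetime) -> timedelta:
    """Return how long ago the computer account password was last changed."""
    return now - last_set


def is_overdue(last_set: datetime, now: datetime,
               max_age: timedelta = DEFAULT_MAX_AGE) -> bool:
    """True if the password is older than the rotation interval."""
    return password_age(last_set, now) > max_age


# Hypothetical dates, for illustration only.
now = datetime(2009, 11, 3)
print(is_overdue(datetime(2009, 10, 20), now))  # password is 14 days old
print(is_overdue(datetime(2009, 9, 1), now))    # password is 63 days old
```

Running the same check against all four servers would show whether they all stopped rotating at about the same time, which would point at a shared cause such as the link to the DC.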
|
| # ? Nov 03, 2009 18:55 |
|
We've got a site 7,000 miles away in the Philippines where typhoons take the internet down all the time, and we have considerably fewer issues than you do. It sounds like there has got to be a better connectivity solution out there for you than what you are using, because it sounds like a pair of cups and a string.
|
| # ? Nov 03, 2009 19:09 |
|
If your WAN link to the domain controller is that unreliable, I'd suggest making one of your servers at the remote site a domain controller. It's much easier to make sure AD replication is working from the one DC than to check each server individually for errors in the log.
|
| # ? Nov 03, 2009 21:11 |
|
Sorry, I wasn't too clear. Their domain controller is actually in the same room as the servers loaded with our software, and the link speed between them is 100 Mbit/s, which is why I am so confused about how this happened. Like I said, the switch or router that our four servers are on is on a different subnet than the rest of their network, so traffic has to go through a gateway (one hop) when I run a traceroute from our servers to their domain controller. That's about the only thing out of the ordinary. Unfortunately, I am not sure how their network topology is set up.
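For anyone who wants to sanity-check the "different subnet" part, this is a minimal Python sketch using the standard ipaddress module. The addresses and prefix length are hypothetical, since the actual addressing at the client site is not known here. If the two hosts are not in the same network, traffic between them has to be routed, which matches the one hop seen in the traceroute.

```python
import ipaddress


def same_subnet(host_a: str, host_b: str, prefix_len: int) -> bool:
    """True if both hosts fall in the same network for the given prefix length."""
    net_a = ipaddress.ip_interface(f"{host_a}/{prefix_len}").network
    net_b = ipaddress.ip_interface(f"{host_b}/{prefix_len}").network
    return net_a == net_b


# Hypothetical addresses, not taken from the client's network.
terminal_server = "10.0.2.15"
domain_controller = "10.0.1.10"

if same_subnet(terminal_server, domain_controller, 24):
    print("Same subnet: traffic stays on the local segment.")
else:
    print("Different subnets: traffic must pass through a gateway.")
```

A routed hop means the gateway device and its configuration are one more thing that can interrupt contact with the DC even when both machines are in the same room.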
|
| # ? Nov 03, 2009 23:48 |
|
It's also possible someone accidentally deleted them from Active Directory. If the domain is under their control and someone was "tinkering" with AD, this is the most likely explanation. That, or something happened with the routing between those two subnets.
|
| # ? Nov 04, 2009 00:40 |
|
I doubt anyone deleted the servers from Active Directory. They only have three IT people, and I don't know if they even know how to do that. It probably does have something to do with routing between the subnets. The device our four servers are plugged into has something like 96 gigabit ports on it; they just never implemented it for the rest of their network, so we plugged our stuff into it. They have outside IT contractors who put in all the Cisco networking equipment. I think I will email them and see if they have any ideas.
|
| # ? Nov 04, 2009 03:14 |
|
Could be a silly question, but is the local DC doing DNS also? AD requires DNS to look up service providers (Kerberos and LDAP), so sometimes if the servers have trouble with DNS they'll fall off the domain. Also make sure that all the servers have that local DNS provider as their primary. Kerberos tickets are renewed every 10 hours, though they don't expire for days, and as previously mentioned a machine has about a month to reestablish connectivity. The only oddity you'll see without additional event logging enabled would be an inability for new users to log on to the boxes in question.
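As a rough illustration of the DNS dependency, the Python sketch below builds the SRV record names that domain members query to locate LDAP and Kerberos services, and prints nslookup commands to run from one of the affected servers. The domain name and DNS server address are placeholders, not the client's real values, and the helper names are made up for this example.

```python
from typing import List


def ad_srv_records(dns_domain: str) -> List[str]:
    """Return the common SRV names used to locate domain controllers."""
    domain = dns_domain.strip(".").lower()
    return [
        f"_ldap._tcp.dc._msdcs.{domain}",
        f"_kerberos._tcp.dc._msdcs.{domain}",
        f"_ldap._tcp.{domain}",
        f"_kerberos._tcp.{domain}",
    ]


def nslookup_commands(dns_domain: str, dns_server: str) -> List[str]:
    """Build one nslookup SRV query per record, aimed at a specific DNS server."""
    return [
        f"nslookup -type=SRV {name} {dns_server}"
        for name in ad_srv_records(dns_domain)
    ]


# Placeholder domain and DNS server; substitute the real ones.
for command in nslookup_commands("corp.example.com", "10.0.1.10"):
    print(command)
```

If any of these queries fail or return a DC that the servers cannot reach, that is worth fixing before looking further at Kerberos or the computer accounts.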
|
| # ? Nov 04, 2009 04:11 |
|
LoKout posted: Could be a silly question but is the local DC doing DNS also? AD requires DNS to do lookups for service providers (Kerberos and LDAP), so sometimes if the servers have trouble with DNS they'll fall off the domain. Also make sure that all the servers have that local DNS provider as their primary.
The local DC does DNS also, yes. I checked the error logs, and the only thing I could find was that the system lost contact with the domain controller and so could not apply group policy.
|
| # ? Nov 07, 2009 02:01 |








