
Ansible connection timed out to Domain Controller

I recently set up Ansible to manage my virtual servers, and I’m using it to apply updates to both Linux and Windows machines.  I ran into an issue specific to updating Windows domain controllers, and it seems to have stumped a lot of other folks too, as you can see in the related GitHub discussion.

The gist of the problem is that I had disabled the built-in Administrator account in Active Directory and created a service account for Ansible to use when running my playbook against Windows machines.  It worked fine on every workstation and server I had, except for domain controllers.  I don’t think it matters, but just in case, my DCs are running Windows Server 2012 R2.  Whenever I ran the playbook against them, Ansible would create the job in Task Scheduler to run the PowerShell scripts that check for and apply updates, but the job would fail to start.  Running the playbook with the -vvv option only produced a message that it had timed out waiting for the job to start, and I didn’t see anything out of the ordinary in the Event Logs on the servers either.  I found that if I enabled the built-in Administrator account in AD and let Ansible use those credentials, it worked just fine.

It wasn’t until I logged into one of the domain controllers after letting the playbook fail, and checked the scheduled tasks, that I realized what the issue was.  You can find the job that Ansible creates by opening Computer Management and going to Task Scheduler > Task Scheduler Library > Microsoft > Windows > PowerShell > ScheduledJobs.  (If there is a failed job here, you must delete it, because Ansible cannot create the job again and your playbook will fail on the next run.)  The job had been created with my service account.  In its properties I changed the credentials to the built-in Administrator account and started it manually, and it ran just fine.  I then changed the credentials back to my service account and tried to run the job again.  That is when I finally got a useful error message, stating that the account must have “Log on as batch job” rights in order to perform the task.

On most machines those rights are managed via local group policy.  Domain controllers are a bit different: you can’t modify the setting on the machine directly, so it must be configured in Group Policy Management.  To add the rights to your service account, either modify the Default Domain Controllers Policy or create a separate policy and link it to your Domain Controllers OU.  In that policy, navigate to Computer Configuration > Policies > Windows Settings > Security Settings > Local Policies > User Rights Assignment.  Locate “Log on as a batch job” in the list, double-click it, and add your user or group.
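
If you want to confirm that the right has actually reached a DC once the policy has refreshed, the following is a minimal sketch rather than part of my original setup.  It assumes an elevated PowerShell session on the DC, and the account name CONTOSO\svc-ansible is a made-up placeholder for your own service account.  It exports the user rights assignments with secedit and shows the line for SeBatchLogonRight, which is the internal name for “Log on as a batch job”.

```powershell
# Hypothetical account name; replace with your own service account or group.
$account = New-Object System.Security.Principal.NTAccount('CONTOSO\svc-ansible')

# Resolve the account to its SID, since the export lists most entries as *SID.
$sid = $account.Translate([System.Security.Principal.SecurityIdentifier]).Value
Write-Output "Look for this SID in the output below: $sid"

# Export only the user rights assignments to a temporary file.
$cfg = Join-Path $env:TEMP 'user-rights.inf'
secedit /export /cfg $cfg /areas USER_RIGHTS | Out-Null

# SeBatchLogonRight is the internal name of "Log on as a batch job".
Select-String -Path $cfg -Pattern '^SeBatchLogonRight'

# Clean up the temporary export.
Remove-Item $cfg
```

If you granted the right to a group instead of the account itself, the group’s SID is what should appear on that line, not the service account’s.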

Once that is done, run gpupdate /force on your DCs and make sure any failed scheduled tasks that Ansible created earlier have been deleted.  Then try the playbook again, and it should run without a hitch.
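
Those two steps can also be done from an elevated PowerShell prompt on each DC.  This is a rough sketch under a couple of assumptions: the ScheduledTasks cmdlets are available (they ship with Windows Server 2012 and later), and nothing other than the leftover Ansible job lives in the ScheduledJobs folder mentioned earlier.  Review the list before running the last line, since it deletes every task in that folder.

```powershell
# Refresh Group Policy so the new user right is applied to this DC.
gpupdate /force

# Folder in Task Scheduler where the failed Ansible job was found.
$path = '\Microsoft\Windows\PowerShell\ScheduledJobs\'

# List any leftover tasks; stay quiet if the folder is already empty.
Get-ScheduledTask -TaskPath $path -ErrorAction SilentlyContinue

# After reviewing the list, remove the leftovers without prompting.
Get-ScheduledTask -TaskPath $path -ErrorAction SilentlyContinue |
    Unregister-ScheduledTask -Confirm:$false
```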

For any who might be interested, this is the playbook I’m using:

I posted the same information to the GitHub discussion that I linked at the top of the article.  I was told it would be added to the documentation, and that it might also be possible to add some checks to the module so that Ansible can give better guidance on how to correct the issue when it happens.
