Troubleshooting Activity Runner

Activities remain in queued state

Normally, when a user runs an activity from the website, it goes into a queued state. The website then sends a wakeup request to all ActivityRunners that it knows about, and one of the runners picks up the activity. If all runners are busy, this can take some time, but if a runner is available, pickup should be almost instantaneous.

If sending the wakeup request fails for some reason, the activity will remain in a queued state and will not get picked up automatically unless the runner is restarted. This is because the runner checks the queue only at startup, after it finishes running an activity, or when it gets a wakeup request. If it is not running any activity and does not get a wakeup request, it will never pick up an activity on its own.
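The pickup rules above can be modelled with a small Python sketch. This is a toy illustration only, not the real ActivityRunner code: the class name RunnerSketch, its methods, and the activity name are hypothetical, and "running" an activity is reduced to appending it to a list.

```python
import queue


class RunnerSketch:
    """Toy model of the pickup rules; not the real ActivityRunner."""

    def __init__(self, activity_queue):
        self.activity_queue = activity_queue  # queue of pending activities
        self.completed = []                   # activities this runner has "run"

    def drain_queue(self):
        # Take activities one at a time. Looping back after each one is the
        # "after finishing an activity" trigger.
        while True:
            try:
                activity = self.activity_queue.get_nowait()
            except queue.Empty:
                return  # nothing queued: the runner goes idle
            self.completed.append(activity)  # stand-in for running the activity

    def start(self):
        # Trigger: check the queue at startup.
        self.drain_queue()

    def on_wakeup(self):
        # Trigger: check the queue when a wakeup request arrives.
        self.drain_queue()
```

In this model, an activity queued while the runner is idle stays in the queue until on_wakeup() or start() is called, which mirrors why a lost wakeup request leaves the activity queued.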

When a runner gets a wakeup request, this is logged in SheetKraft.log with a message of 'Received wakup'. If no such log entry is found, there may be a network issue, such as a firewall blocking the request. This happens most commonly when an activity runner is installed on a different machine from the primary application server.

One way to test this is to run the following command in PowerShell:

Invoke-WebRequest -Uri http://ip:port/wakeup -Method post -TimeoutSec 10

Replace ip and port with the values for the runner that is not receiving the wakeup request. The command is expected to return a successful but empty response (HTTP status code 204). If it times out or fails in any other way, there is a network problem that can be demonstrated to the IT team.

Run the command both from the primary application server and from the machine hosting the additional activity runner. If the command works from the activity runner machine but fails from the primary application server, this clearly indicates a network problem such as a misconfigured firewall.
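Where PowerShell is not convenient, the same check can be sketched in Python using only the standard library. The function name check_wakeup is hypothetical. The sketch assumes the runner accepts an empty POST body on /wakeup, and it deliberately bypasses any configured proxy so that the request goes straight to the runner.

```python
import urllib.error
import urllib.request

# Opener that ignores proxy settings, so the request is sent directly.
_DIRECT = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def check_wakeup(base_url, timeout=10):
    """POST to <base_url>/wakeup and return (ok, detail).

    ok is True only for the empty HTTP 204 response described above.
    """
    url = base_url.rstrip("/") + "/wakeup"
    request = urllib.request.Request(url, data=b"", method="POST")
    try:
        with _DIRECT.open(request, timeout=timeout) as response:
            status = response.status
    except urllib.error.HTTPError as exc:
        # The runner was reachable but answered with an error status.
        return False, f"HTTP {exc.code}"
    except OSError as exc:
        # Timeouts, refused connections and name resolution failures.
        return False, f"request failed: {exc}"
    return status == 204, f"HTTP {status}"
```

For example, check_wakeup("http://ip:port") with ip and port replaced as described above returns (True, "HTTP 204") when the runner is reachable, and (False, ...) with a short reason otherwise.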

Activities remain in queued state - Wake up Request to the Activity Runner Failing

Issue

The wakeup requests to the ActivityRunner service were failing even though the service was running on the same machine as the website. The URL configured for the service used localhost.

Checks Done

telnet localhost <port> failed.

Invoke-WebRequest from PowerShell to the wakeup endpoint of ActivityRunner failed.

nslookup localhost failed.

Invoke-WebRequest from PowerShell to the website URL failed with localhost but worked with the hostname.
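The name resolution and port checks above can also be scripted. The following Python sketch uses hypothetical function names. Note that socket.getaddrinfo goes through the operating system resolver (including the hosts file), so it is not an exact substitute for nslookup, which queries a DNS server directly.

```python
import socket


def resolve_host(hostname):
    """Return the sorted IP addresses the OS resolver gives for hostname, or [] on failure."""
    try:
        infos = socket.getaddrinfo(hostname, None)
    except socket.gaierror:
        return []
    # Each entry ends with a sockaddr tuple whose first item is the IP address.
    return sorted({info[4][0] for info in infos})


def port_is_open(host, port, timeout=5):
    """Return True if a TCP connection to host:port succeeds (what telnet tests)."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Comparing port_is_open("localhost", port) with port_is_open("127.0.0.1", port) and with the machine's hostname helps separate a name resolution problem from a listener that is bound to the wrong address.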

Background

http.sys is the driver that implements the HTTP protocol in the Windows networking stack. It is specifically designed to optimize and accelerate the processing of HTTP requests.

Commands starting with net, such as netstat and netsh, can be used to inspect the network state and to configure the settings for this driver.

For instance, the netstat -a command displays a list of all active network connections and listening ports. It is helpful for diagnosing network-related issues, identifying active connections, and determining which ports are being used by the network services on the system.

IIS provides additional functionality on top of http.sys. It has its own settings but also depends on the driver settings. Changes to IIS settings work correctly only if the underlying driver settings are consistent with them, and the IIS Manager UI can sometimes get out of sync with the driver settings. For example, the hostname and IP address configured in Bindings are probably maintained by both IIS and http.sys, and changes in the UI are supposed to configure the driver settings automatically. If someone changes the driver settings directly, or if there are bugs in the IIS Manager UI, some cleanup might need to be done directly via netsh (Network Shell) commands.

Note: Running these commands directly is risky, since it can result in a mismatch between IIS and http.sys.

The ActivityRunner service uses .NET classes that build on top of http.sys without going through IIS. Any configuration needed for ActivityRunner must be done with netsh directly because there is no helpful IIS Manager UI. Usually, listening on localhost does not require any explicit settings or permissions. Listening on specific IP addresses or on standard ports like 80, 443, etc. requires explicitly granted permissions.

Resolution

Run the netstat -a command to check all the registered listeners. If the URL in the runner config uses localhost, the expected IP in the netstat output is 0.0.0.0 or 127.0.0.1.
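Scanning a long netstat listing by eye is error-prone, so the check can be sketched in Python. The function names, the SAMPLE text, and the addresses and ports in it are illustrative assumptions, not output from a real system. The sketch also assumes the numeric form netstat -a -n, so that addresses and ports appear as numbers rather than names.

```python
# Illustrative text in the column layout printed by Windows netstat.
SAMPLE = """
  Proto  Local Address          Foreign Address        State
  TCP    0.0.0.0:135            0.0.0.0:0              LISTENING
  TCP    192.168.1.20:8080      0.0.0.0:0              LISTENING
  TCP    192.168.1.20:49712     10.0.0.5:443           ESTABLISHED
"""

EXPECTED_FOR_LOCALHOST = {"0.0.0.0", "127.0.0.1"}


def listening_addresses(netstat_output, port):
    """Return the local IPs that have a TCP listener on the given port."""
    addresses = []
    for line in netstat_output.splitlines():
        fields = line.split()
        # Expected columns: Proto, Local Address, Foreign Address, State
        if len(fields) < 4 or fields[0] != "TCP" or fields[3] != "LISTENING":
            continue
        host, _, local_port = fields[1].rpartition(":")
        if local_port == str(port):
            addresses.append(host)
    return addresses


def has_expected_listener(addresses):
    """Return True if any address is one that serves requests sent to localhost."""
    return any(address in EXPECTED_FOR_LOCALHOST for address in addresses)
```

With the sample text, listening_addresses(SAMPLE, 8080) yields a listener bound only to 192.168.1.20, which has_expected_listener reports as not matching a localhost URL. On a real machine the text could be captured with subprocess.run(["netstat", "-a", "-n"], capture_output=True, text=True).stdout.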

If it is something else, go to IIS Bindings and delete them (note what they were before deleting so that they can be recreated later). Then run netsh http show iplisten and keep a record of what it shows.

Use netsh http delete iplisten to delete any unexpected IP bindings. 

Then go to IIS and recreate the desired bindings. netsh http add iplisten can be used to manually add the deleted bindings if something goes wrong.
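Because the recorded entries are needed again if something goes wrong, it can help to generate the delete commands and the matching restore commands together before changing anything. The Python sketch below only builds the command strings and does not run them. The function name iplisten_commands is hypothetical, and the IP addresses must be typed in from the recorded netsh http show iplisten output.

```python
import ipaddress


def iplisten_commands(recorded_ips):
    """Return (delete_commands, restore_commands) for the recorded iplisten entries."""
    delete_commands = []
    restore_commands = []
    for raw in recorded_ips:
        # ip_address() raises ValueError for anything that is not a valid IP,
        # which guards against typos in the recorded list.
        ip = str(ipaddress.ip_address(raw.strip()))
        delete_commands.append(f"netsh http delete iplisten ipaddress={ip}")
        restore_commands.append(f"netsh http add iplisten ipaddress={ip}")
    return delete_commands, restore_commands
```

Saving the restore commands to a text file before running any delete command gives a ready-made rollback path.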

Stack Overflow solution example

For example:

netsh http delete iplisten ipaddress=127.0.0.1

Once the configuration is removed, run the netsh http add iplisten command to add the required IP configuration.

For example:

netsh http add iplisten ipaddress=0.0.0.0

Restart the server for the changes to take effect.