At https://github.com/n8n-io/n8n/pull/8213 we introduced Redis hashes
for workflow ownership and manual webhooks...
- to remove clutter from multiple related keys at the top level,
- to improve performance by preventing serializing-deserializing, and
- to guarantee atomicity during concurrent updates in multi-main setup.
Workflow activation errors can also benefit from this. Added test
coverage as well.
To test manually, create a workflow with a trigger with an invalid
credential, edit the workflow's `active` column to `true`, and restart.
The activation error should show as a red triangle on canvas and in the
workflow list.
Story: https://linear.app/n8n/issue/PAY-1188
- Implement Redis hashes on the caching service, based on Micha's work
in #7747, adapted from `node-cache-manager-ioredis-yet`. Optimize
workflow ownership lookups and manual webhook lookups with Redis hashes.
- Simplify the caching service by removing all currently unused methods
and options: `enable`, `disable`, `getCache`, `keys`, `keyValues`,
`refreshFunctionEach`, `refreshFunctionMany`, `refreshTtl`, etc.
- Remove the flag `N8N_CACHE_ENABLED`. Currently some features on
`master` are broken with caching disabled, and test webhooks now rely
entirely on caching, for multi-main setup support. We originally
introduced this flag to protect against excessive memory usage, but
total cache usage is low enough that we decided to drop this setting.
Apparently this flag was also never documented.
- Overall caching service refactor: use generics, reduce branching, add
discriminants for cache kinds for better type safety, type caching
events, improve readability, remove outdated docs, etc. Also refactor
and expand caching service tests.
Follow-up to: https://github.com/n8n-io/n8n/pull/8176
---------
Co-authored-by: Michael Auerswald <michael.auerswald@gmail.com>
In a multi-main setup, we have the following issue. The user's client
connects to main A and runs a test webhook, so main A starts listening
for a webhook call. A third-party service sends a request to the test
webhook URL. The request is forwarded by the load balancer to main B,
who is not listening for this webhook call. Therefore, the webhook call
is unhandled.
To start addressing this, cache test webhook registrations, using Redis
for queue mode and in-memory for regular mode. When the third-party
service sends a request to the test webhook URL, the request is
forwarded by the load balancer to main B, who fetches test webhooks from
the cache and, if it finds a match, executes the test webhook. This
should be transparent - test webhook behavior should remain the same as
so far.
Notes:
- Test webhook timeouts are not cached. A timeout is only relevant to
the process it was created in, so another process retrieving from Redis
a "foreign" timeout will be unable to act on it. A timeout also has
circular references, so `cache-manager-ioredis-yet` is unable to
serialize it.
- In a single-main scenario, the timeout remains in the single process
and is cleared on test webhook expiration, successful execution, and
manual cancellation - all as usual.
- In a multi-main scenario, we will need to have the process who
received the webhook call send a message to the process who created the
webhook directing this originating process to clear the timeout. This
will likely be implemented via execution lifecycle hooks and Redis
channel messages checking session ID. This implementation is out of
scope for this PR and will come next.
- Additional data in test webhooks is not cached. From what I can tell,
additional data is not needed for test webhooks to be executed.
Additional data also has circular references, so
`cache-manager-ioredis-yet` is unable to serialize it.
Follow-up to: #8155
also refactored the code to
1. stop passing around `scope === 'global'`, since this code can be used
only for changing globalRole.
2. leak less details when input validation fails.
## Review / Merge checklist
- [x] PR title and summary are descriptive
- [x] Tests included
## Summary
This PR continues refactoring webhooks code for better modularity.
Continued from #8069 to bring back `ActiveWebhooks`, but this time
actually handling active webhook calls in this class.
## Review / Merge checklist
- [x] PR title and summary are descriptive
This PR simplifies state in test webhooks so that it can be cached
easily. Caching this state will allow us to start using Redis for manual
webhooks, to support manual webhooks to work in multi-main setup.
- [x] Convert `workflowWebhooks` to a getter - no need to optimize for
deactivation
- [x] Remove array from value in `TestWebhooks.webhookUrls`
- [x] Consolidate `webhookUrls` and `registeredWebhooks`
- Move webhook, poller and trigger activation logs closer to activation
event
- Enrich response of `/debug/multi-main-setup`
- Ensure workflow updates broadcast activation state changes only if
state changed
- Fix bug on workflow activation after leadership change
- Ensure debug controller is not available in production
---------
Co-authored-by: Omar Ajoue <krynble@gmail.com>