Story: https://linear.app/n8n/issue/PAY-926
This PR coordinates workflow activation on instance startup and on
leadership change in the multi-main scenario in the internal API. Part 3,
on manual workflow activation and deactivation, will be a separate PR.
### Part 1: Instance startup
In the multi-main scenario, on starting an instance...
- [x] If the instance is the leader, it should add webhooks, triggers
and pollers.
- [x] If the instance is the follower, it should not add webhooks,
triggers or pollers.
- [x] Unit tests.
### Part 2: Leadership change
In the multi-main scenario, if the main instance leader dies…
- [x] The new main instance leader must activate all trigger- and
poller-based workflows, excluding webhook-based workflows.
- [x] The old main instance leader must deactivate all trigger- and
poller-based workflows, excluding webhook-based workflows.
- [x] Unit tests.
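As a rough illustration of the Part 2 behavior, a hypothetical sketch of a leadership-change handler; the runner interface and method names are assumptions, not n8n's actual API:
```typescript
// Hypothetical sketch of reacting to a leadership change; the runner
// interface and method names are illustrative, not n8n's actual API.
interface TriggerAndPollerRunner {
  findActiveWorkflowIds(): Promise<string[]>;
  addTriggersAndPollers(workflowId: string): Promise<void>;
  removeTriggersAndPollers(workflowId: string): Promise<void>;
}

async function onLeadershipChange(runner: TriggerAndPollerRunner, becameLeader: boolean) {
  for (const id of await runner.findActiveWorkflowIds()) {
    // Webhooks are excluded: they stay registered and can be served by any
    // main instance, so only triggers and pollers follow the leader.
    if (becameLeader) await runner.addTriggersAndPollers(id);
    else await runner.removeTriggersAndPollers(id);
  }
}
```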
To test, start two instances and check behavior on startup and
leadership change:
```
EXECUTIONS_MODE=queue N8N_LEADER_SELECTION_ENABLED=true N8N_LICENSE_TENANT_ID=... N8N_LICENSE_ACTIVATION_KEY=... N8N_LOG_LEVEL=debug npm run start
EXECUTIONS_MODE=queue N8N_LEADER_SELECTION_ENABLED=true N8N_LICENSE_TENANT_ID=... N8N_LICENSE_ACTIVATION_KEY=... N8N_LOG_LEVEL=debug N8N_PORT=5679 npm run start
```
This PR ensures `MultiMainInstancePublisher` is initialized before
checking whether the instance is the leader or a follower. Follower
instances skip license initialization, the license check, and starting
and stopping pruning.
https://linear.app/n8n/issue/PAY-933/set-up-leader-selection-for-multiple-main-instances
- [x] Set up new envs
- [x] Add config and license checks
- [x] Implement `MultiMainInstancePublisher`
- [x] Expand `RedisServicePubSubPublisher` to support
`MultiMainInstancePublisher`
- [x] Init `MultiMainInstancePublisher` on startup and destroy on
shutdown
- [x] Add to sandbox plans
- [x] Test manually
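For context, a minimal sketch of how leader selection can work over Redis, assuming a key with a TTL that the leader keeps renewing; the key name, TTL, and interval are illustrative assumptions, not the actual implementation:
```typescript
import { createClient } from 'redis';

// Minimal leader-election sketch: whoever holds the key is leader, and a
// dead leader's key expires so a follower can take over. All names and
// intervals here are illustrative assumptions.
const LEADER_KEY = 'n8n:main:leader';
const TTL_SECONDS = 10;

async function runLeaderSelection(instanceId: string, onChange: (isLeader: boolean) => void) {
  const redis = createClient();
  await redis.connect();
  let wasLeader = false;
  setInterval(async () => {
    // SET NX only succeeds when no leader key exists; EX bounds the claim.
    await redis.set(LEADER_KEY, instanceId, { NX: true, EX: TTL_SECONDS });
    const isLeader = (await redis.get(LEADER_KEY)) === instanceId;
    if (isLeader) await redis.expire(LEADER_KEY, TTL_SECONDS); // renew the claim
    if (isLeader !== wasLeader) onChange(isLeader);
    wasLeader = isLeader;
  }, (TTL_SECONDS / 2) * 1000);
}
```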
Note: This is only for setup - coordinating in reaction to leadership
changes will come in later PRs.
This change ensures that things like `encryptionKey` and `instanceId`
are always available directly where they are needed, instead of passing
them around throughout the code.
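A hedged illustration of the pattern with typedi; the class and field names are assumptions, not the PR's exact code:
```typescript
import 'reflect-metadata';
import { Container, Service } from 'typedi';

// Illustrative only: a settings service the container makes available
// anywhere, instead of threading encryptionKey/instanceId through calls.
@Service()
class InstanceSettings {
  readonly encryptionKey = process.env.N8N_ENCRYPTION_KEY ?? '';
  readonly instanceId = 'generated-at-startup'; // placeholder value
}

// Any consumer resolves the shared instance directly where it's needed.
const { encryptionKey, instanceId } = Container.get(InstanceSettings);
```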
All commands sent between the main instance and workers need to contain a
sender ID, to prevent senders from reacting to their own messages and
causing loops.
This PR makes sure all sent messages contain a sender ID by default, as
part of constructing the sending Redis client.
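A minimal sketch of that default, with illustrative message and client shapes:
```typescript
// Sketch of stamping every published message with a sender ID so that
// subscribers can drop their own messages; the shapes are illustrative.
interface PubSubMessage {
  senderId: string;
  command: string;
  payload?: unknown;
}

function createPublisher(
  publish: (channel: string, message: string) => void,
  senderId: string,
) {
  // The sender ID is attached once here, when the sending client is
  // constructed, so every message carries it by default.
  return (channel: string, command: string, payload?: unknown) => {
    const message: PubSubMessage = { senderId, command, payload };
    publish(channel, JSON.stringify(message));
  };
}

function shouldHandle(raw: string, ownId: string): boolean {
  // Ignoring our own messages is what breaks the feedback loop.
  return (JSON.parse(raw) as PubSubMessage).senderId !== ownId;
}
```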
---------
Co-authored-by: कारतोफ्फेलस्क्रिप्ट™ <aditya@netroy.in>
Depends on: #7092 | Story:
[PAY-768](https://linear.app/n8n/issue/PAY-768)
This PR:
- Generalizes the `IBinaryDataManager` interface.
- Adjusts `Filesystem.ts` to satisfy the interface.
- Sets up an S3 client stub to be filled in in the next PR.
- Turns `BinaryDataManager` into an injectable service.
- Adjusts the config schema and adds new validators.
Note that the PR looks large but all the main changes are in
`packages/core/src/binaryData`.
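For orientation, a rough sketch of what the generalized interface might look like; the method names are assumptions, not the actual code in `packages/core/src/binaryData`:
```typescript
import type { Readable } from 'stream';

// Rough sketch of a storage-agnostic binary data client; the method names
// are assumptions, not the actual interface in packages/core.
interface BinaryDataClient {
  init(): Promise<void>;
  store(binaryDataId: string, data: Buffer | Readable): Promise<void>;
  getAsStream(binaryDataId: string): Promise<Readable>;
  getAsBuffer(binaryDataId: string): Promise<Buffer>;
  deleteMany(binaryDataIds: string[]): Promise<void>;
}
```
Under this sketch, `fs.client.ts` is one implementation and the S3 stub is another, to be filled in in the next PR.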
Out of scope:
- `BinaryDataManager` (now `BinaryDataService`) and `Filesystem.ts` (now
`fs.client.ts`) were slightly refactored for maintainability, but fully
overhauling them is **not** the focus of this PR, which is meant to
clear the way for the S3 implementation. Future improvements for these
two should include:
  - setting up a backwards-compatible dir structure that makes it easier
to locate binary data files to delete,
  - removing duplication,
  - simplifying cloning methods,
  - using integers for binary data size instead of `prettyBytes()`,
  - writing tests for existing binary data logic.
---------
Co-authored-by: कारतोफ्फेलस्क्रिप्ट™ <aditya@netroy.in>
Based on #7065 | Story: https://linear.app/n8n/issue/PAY-771
In filesystem mode, n8n marks binary data for deletion on manual
execution deletion, on unsaved execution completion, and on every
execution pruning cycle. We later prune binary data in a separate cycle
via these marker files, based on the configured TTL. In the context of
introducing an S3 client to manage binary data, the filesystem mode's
mark-and-prune setup is too tightly coupled to the general binary data
management client interface.
This PR...
- Ensures the deletion of an execution causes the deletion of any binary
data associated with it. This does away with the need for a binary data TTL
and simplifies the filesystem mode's mark-and-prune setup.
- Refactors all execution deletions (including pruning) to cause soft
deletions, hard-deletes soft-deleted executions based on the existing
pruning config, and adjusts execution endpoints to filter out
soft-deleted executions. This reduces DB load, and keeps binary data
around long enough for users to access it when building workflows with
unsaved executions.
- Moves all execution pruning work from an execution lifecycle hook to
`execution.repository.ts`. This keeps related logic in a single place.
- Removes all marking logic from the binary data manager. This
simplifies the interface that the S3 client will meet.
- Adds basic sanity-check tests to pruning logic and execution deletion.
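A condensed sketch of the soft-delete flow with TypeORM; the entity shape and pruning windows are illustrative assumptions, not the PR's exact code:
```typescript
import 'reflect-metadata';
import { Column, DataSource, DeleteDateColumn, Entity, PrimaryGeneratedColumn } from 'typeorm';

// Condensed soft-delete sketch; entity shape and windows are assumptions.
@Entity()
class ExecutionEntity {
  @PrimaryGeneratedColumn()
  id!: number;

  @Column()
  startedAt!: Date;

  @DeleteDateColumn()
  deletedAt?: Date; // set on soft delete; endpoints filter on it
}

async function pruneExecutions(dataSource: DataSource, softAfterMs: number, hardAfterMs: number) {
  const repo = dataSource.getRepository(ExecutionEntity);
  const now = Date.now();

  // Step 1: soft-delete old executions. API queries exclude rows with a
  // non-null deletedAt, so users stop seeing them immediately while the
  // binary data stays around a little longer.
  await repo
    .createQueryBuilder()
    .softDelete()
    .where('startedAt < :cutoff', { cutoff: new Date(now - softAfterMs) })
    .execute();

  // Step 2: hard-delete rows soft-deleted long enough ago; in the PR this
  // is also the point where associated binary data gets deleted.
  await repo
    .createQueryBuilder()
    .delete()
    .where('deletedAt < :cutoff', { cutoff: new Date(now - hardAfterMs) })
    .execute();
}
```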
Out of scope:
- Improving existing pruning logic.
- Improving existing execution repository logic.
- Adjusting dir structure for filesystem mode.
---------
Co-authored-by: कारतोफ्फेलस्क्रिप्ट™ <aditya@netroy.in>
This PR implements the updated license SDK so that worker and webhook
instances no longer auto-renew licenses.
Instead, they receive a `reloadLicense` command via the Redis client,
which fetches the updated license after it has been saved on the main
instance.
This PR also moves the Redis pub and sub clients directly into the event
bus, to prevent cyclic dependency issues.
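A minimal sketch of the worker-side handling; the message shape and license API are illustrative assumptions:
```typescript
// Minimal sketch of the worker-side handling of `reloadLicense`; the
// message shape and license API here are illustrative assumptions.
interface LicenseLike {
  reload(): Promise<void>; // re-fetch the license saved by the main instance
}

function handleCommandMessage(raw: string, license: LicenseLike) {
  const { command } = JSON.parse(raw) as { command: string };
  if (command === 'reloadLicense') {
    // Workers and webhook instances no longer renew licenses themselves;
    // they reload after the main instance has saved an updated one.
    void license.reload();
  }
}
```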
* fix(core): Make node execution order configurable, and backward-compatible
* ⚡ Also add new Merge-Node behaviour
* ⚡ Fix typo
* Fix lint issue
* update labels
* rename legacy to v0
* remove the unnecessary log
* default all new workflows to use v1 execution-order
* remove the controller changes
* clone default settings to avoid it getting modified
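One commit above ("clone default settings to avoid it getting modified") guards against shared mutable state; a minimal illustration, with an assumed settings shape:
```typescript
// Minimal illustration of the 'clone default settings' commit above: hand
// each new workflow its own copy so edits cannot mutate the shared
// defaults. The settings shape is an assumption.
const defaultWorkflowSettings = { executionOrder: 'v1' as const };

function newWorkflowSettings() {
  // A deep copy prevents callers from mutating the module-level default
  // object that every new workflow starts from.
  return structuredClone(defaultWorkflowSettings);
}
```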
---------
Co-authored-by: Jan Oberhauser <jan.oberhauser@gmail.com>
* remove unnecessary Db re-initialization
This is from before we added `Db.init` in `WorkflowRunnerProcess`.
* feat(core): Improved health check
* make health check not care about DB connections
* close DB connections, and shutdown the timer
If you install a community node with `polling: true`, activating a workflow with that node fails with an error: `WorkflowActivationError: There was a problem activating the workflow: "Could not get parameter "pollTimes"!"`.
You can test this by installing `n8n-nodes-rss-feed-trigger`, creating a workflow with the `RSS Trigger` node, and then trying to activate it. Activation will fail on `master`, but work as expected on this branch.
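An illustrative guard only, not necessarily the shape of the actual fix: fall back to the default declared in the node description when a polling node does not set `pollTimes`, instead of throwing during activation:
```typescript
// Illustrative guard only: prefer the node's own `pollTimes`, falling
// back to the default from the node description rather than throwing.
function getPollTimes(
  nodeParameters: Record<string, unknown>,
  descriptionDefaults: Record<string, unknown>,
): unknown {
  return nodeParameters.pollTimes ?? descriptionDefaults.pollTimes;
}
```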
* use typedi for UserManagementMailer
* use typedi for SamlService
* fix typos
* use typedi for Queue
* use typedi for License
* convert some more code to use typedi
* add typedi
* convert ActiveWorkflowRunner into an injectable service
* convert ExternalHooks into an injectable service
* convert InternalHooks into an injectable service
* convert LoadNodesAndCredentials into an injectable service
* convert NodeTypes and CredentialTypes into an injectable service
* convert ActiveExecutions into an injectable service
* convert WaitTracker into an injectable service
* convert Push into an injectable service
* convert ActiveWebhooks and TestWebhooks into injectable services
* handle circular references, and log errors when a circular dependency is found
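As background for the conversions above, the typedi pattern in a nutshell; this is a generic sketch using class names from the commits above, with bodies elided, and the circular-dependency detection and logging is what the last commit adds on top:
```typescript
import 'reflect-metadata';
import { Container, Service } from 'typedi';

// Generic typedi sketch: classes declare themselves injectable, and
// constructor parameters are resolved from the container instead of
// being wired up by hand. Class bodies are elided.
@Service()
class Push {}

@Service()
class ActiveWorkflowRunner {
  constructor(readonly push: Push) {}
}

// Callers resolve a single shared instance from the container.
const runner = Container.get(ActiveWorkflowRunner);
console.log(runner.push instanceof Push); // true
```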
* feat(editor): roll out schema view
* feat(editor): add posthog tracking
* refactor: use composables
* refactor: clean up console log
* refactor: clean up impl
* chore: clean up impl
* fix: fix demo var
* chore: add comment
* refactor: clean up
* chore: wrap error func
* refactor: clean up import
* refactor: make store
* feat: enable rudderstack usebeacon, move event to unload
* chore: clean up alert
* refactor: move tracking from hooks
* fix: reload flags on login
* fix: add func to setup
* fix: clear duplicate import
* chore: add console to test
* chore: add console to test
* fix: try reload
* chore: randomize instance id for testing
* chore: randomize instance id for testing
* chore: add console logs for testing
* chore: move random id to fe
* chore: use query param for testing
* feat: update PostHog api endpoint
* feat: update rs host
* feat: update rs host
* feat: update rs endpoints
* refactor: use api host for BE events as well
* refactor: refactor out posthog client
* feat: add feature flags to login
* feat: add feature flags to login
* feat: get feature flags to work
* feat: add created at to be events
* chore: add todos
* chore: clean up store
* chore: add created at to identify
* feat: add posthog config to settings
* feat: add bootstrapping
* chore: clean up
* chore: fix build
* fix: get dates to work
* fix: get posthog to recognize dates
* chore: refactor
* fix: update back to number
* fix: update key
* fix: get experiment evals to work
* feat: add posthog to signup router
* feat: add feature flags on sign up
* chore: clean up
* fix: fix import
* chore: clean up loading script
* feat: add timeout, fix: script loader
* fix: test timeout and get working on 8080
* refactor: move out posthog
* feat: add experiment tracking
* fix: clear tracked on reset
* fix: fix signup bug
* fix: handle errors when telemetry is disabled
* refactor: remove redundant await
* fix: add back posthog to telemetry
* test: fix test
* test: fix test
* test: add tests for posthog client
* lint: fix
* fix: fix issue with slow decide endpoint
* lint: fix
* lint: fix
* lint: fix
* lint: fix
* chore: address PR feedback
* chore: address PR feedback
* feat: add onboarding experiment
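One recurring theme above is bootstrapping feature flags so experiments evaluate before the `/decide` endpoint responds; a hedged sketch with posthog-js, where the project key and flag values are placeholders:
```typescript
import posthog from 'posthog-js';

// Sketch of bootstrapping feature flags at init so experiments evaluate
// immediately instead of waiting on the /decide endpoint. The project
// key and flag values are placeholders.
function initPostHogWithFlags(
  apiHost: string,
  distinctId: string,
  flags: Record<string, string | boolean>,
) {
  posthog.init('<project-api-key>', {
    api_host: apiHost,
    bootstrap: {
      distinctID: distinctId,
      featureFlags: flags, // available synchronously, refreshed on login
    },
  });
}
```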
* fix(editor): Prevent creation of input connections for nodes without input
* WIP: Workflow checks service and controller
* fix: Created SQLite migration to remove broken connections
* Cleanup & add mysql/postgres migrations
* Linter fixes
* Unify the migration scripts
* Escape migration workflow_entity
* Wrap the migration in try/catch and do not parse nodes and connections if mysql/postgres
* Do migration changes also for mysql
* refactor: Wrap only the necessary call in try catch block
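A simplified sketch of what such a cleanup migration can look like; the table name, placeholder syntax, and class name are illustrative, and as the commits note, the real migration handles SQLite, MySQL, and Postgres differently:
```typescript
import type { MigrationInterface, QueryRunner } from 'typeorm';

// Simplified cleanup-migration sketch; table name, placeholders, and the
// filtering step are illustrative, and databases are handled differently
// in the real migration.
export class RemoveBrokenConnections1000000000000 implements MigrationInterface {
  async up(queryRunner: QueryRunner): Promise<void> {
    const workflows: Array<{ id: string; connections: string }> = await queryRunner.query(
      'SELECT id, connections FROM workflow_entity',
    );
    for (const workflow of workflows) {
      try {
        const connections = JSON.parse(workflow.connections) as Record<string, unknown>;
        // ...drop entries whose target node declares no inputs (elided)...
        await queryRunner.query('UPDATE workflow_entity SET connections = ? WHERE id = ?', [
          JSON.stringify(connections),
          workflow.id,
        ]);
      } catch {
        // Mirror the try/catch commit above: skip workflows whose JSON
        // cannot be parsed instead of failing the whole migration.
      }
    }
  }

  async down(): Promise<void> {
    // The removed connections were already broken; nothing to restore.
  }
}
```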
---------
Co-authored-by: Omar Ajoue <krynble@gmail.com>