Feat/introduce ADRs (#200)

* feat: introduce ADRs to the codebase
* docs: decision about release strategy
fishjam-dev · May 24, 2024 · 0b3270f
1 parent 6e66990
commit 0b3270f
Showing 3 changed files with 197 additions and 0 deletions.
diff --git a/docs/adrs/001-start-using-adrs.md b/docs/adrs/001-start-using-adrs.md
@@ -0,0 +1,43 @@
+---
+status: accepted
+date: 2024-05-17
+deciders: Fishjam team
+---
+# Start using ADRs to document any major (non)architectural decisions for the project
+
+## Context and Problem Statement
+
+- Right now, we don't write down architecture decisions in any standardized format.
+- We would like to keep a log of any major decisions made which affect the application.
+
+## Decision Drivers
+
+* One place to keep all the decisions made
+* One format to follow
+* Easy to read and write
+* Close to the code
+
+## Considered Options
+
+* ADRs inside the repo
+* Google Docs
+* Confluence
+
+## Decision Outcome
+
+Chosen option: "ADRs inside the repo", because:
+
+- Documentation is near the code and is easily accessible for anyone (contributors)
+- No lock-in to other providers
+- Support for Markdown
+
+### Consequences
+
+From now on, when a major decision is made, we are going to write an ADR for it.
+Anything that changes the architecture, provider, configuration, etc. should have an ADR. For other cases, the contributor shall decide if one is necessary.
+If a request for an ADR arises on a PR, it's recommended to provide one with a clear explanation of the decision.
+We write this to have a record and to be able to come back to it later. Because of that, we are going to follow the standard described in the template.
+
+## More Information
+
+We won't write ADRs for decisions that were made in the past. This change takes effect from now on.
diff --git a/docs/adrs/002-use-rolling-update-with-eviction.md b/docs/adrs/002-use-rolling-update-with-eviction.md
@@ -0,0 +1,82 @@
+---
+status: accepted
+date: 2024-05-17
+deciders: Kamil Kołodziej, Radosław Szuma, Jakub Pisarek
+informed: Fishjam Team, Cloud Team
+---
+# Change deployment strategy to eliminate downtime
+
+## Context and Problem Statement
+
+Currently, when we do a deployment, we shut down the server and every room on it.
+This is an awful user experience. Another issue is the fact that any ongoing recordings are lost.
+
+## Decision Drivers
+
+* No downtime while deployment happens
+* 0 lost recordings during deployment
+
+## Considered Options
+
+* Rolling update with eviction
+* Active room migration
+
+## Decision Outcome
+
+Chosen option: "Rolling update with eviction", because
+while the second option sounds great, we acknowledge that it is not trivial to implement, and we want to fix downtime ASAP. The first option is a good starting point that removes the existing problems and lets us move forward.
+
+### Consequences
+
+Every new deployment will need to trigger a fishjam process that handles the shutdown.
+Effectively, we leave the responsibility for triggering the deployment to an external orchestration tool (which is usually the case), but we handle the shutdown process inside the app.
+
+The external process will trigger the fishjam shutdown process by sending a SIGTERM.
+This will mark that fishjam instance as one that no longer accepts new rooms.
+Once every room is closed and all of the recordings are computed, we will shut down that instance of fishjam.
+
+This applies one by one to every instance in the cluster. Although it may take some time to deploy a new version and we may have 2 different versions on the cluster at the same time, we accept that tradeoff.
+We must also consider a `force` option to deploy a version no matter the state of fishjam (which may result in downtime).
+
+## Pros and Cons of the Options
+
+### Rolling update with eviction
+
+1. The external orchestrator process triggers a new deployment
+2. It starts a new instance of fishjam with the new version
+3. One of the old fishjam instances receives SIGTERM
+4. The fishjam shutdown process starts
+5. We mark that fishjam instance as one that no longer allows creating new rooms
+6. We wait until the last room is closed and all recordings are completed
+7. Once the process is completed, we shut down the instance
+8. (Possibly) Repeat for the remaining instances
+
+* Good, because we will eliminate downtimes with deployments
+* Good, because we won't lose any recordings
+* Good, because we trap the SIGTERM and handle shutdown gracefully
+* Bad, because the deployment process may take some time (effectively as long as the longest conversation/stream)
+* Bad, because we may end up with different versions on the cluster at the same time
+
+### Active room migration
+
+This solution wasn't researched much, so we assume the flow would look like this:
+
+1. The external orchestrator process triggers a new deployment
+2. It starts a new instance of fishjam with the new version
+3. One of the old fishjam instances receives SIGTERM
+4. We mark that fishjam instance as one that no longer allows creating new rooms
+5. Fishjam starts migrating rooms to the new instance
+6. Somehow handles the recordings (?)
+7. Once peers/rooms/streams are migrated, the app shuts down
+8. (Possibly) Repeat for the remaining instances
+
+* Good, because we will eliminate downtimes with deployments
+* Good, because we won't lose any recordings
+* Good, because it happens fast; we don't have to wait for rooms to close or streams to end
+* Good, because we will have 2 different versions on the cluster only for a short amount of time
+* Bad, because we don't have a clue how to handle the active peer migration to a new instance right now
+* Bad, because we don't know how to handle the meetings with recording enabled during the migration
+
+## More Information
+
+This decision is heavily dependent on the Cloud Team and may be changed soon to meet their requirements.
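
The shutdown flow chosen in ADR 002 (steps 3 to 7 of "Rolling update with eviction") can be illustrated with a minimal Python sketch. It is not fishjam's implementation: `DrainingInstance` and all of its methods are hypothetical names invented for this example, and rooms and recordings are modelled as plain sets of ids.

```python
import signal


class DrainingInstance:
    """Hypothetical stand-in for one fishjam instance (not the real API)."""

    def __init__(self):
        self.draining = False            # True once SIGTERM has been received
        self.open_rooms = set()          # ids of rooms still in progress
        self.pending_recordings = set()  # ids of recordings not yet completed
        self.stopped = False             # True once the instance may exit

    def handle_sigterm(self, signum=None, frame=None):
        # Steps 3-5: trap SIGTERM and stop accepting new rooms.
        self.draining = True
        self._stop_if_drained()

    def create_room(self, room_id):
        # A draining instance refuses new rooms; existing rooms keep running.
        if self.draining:
            return False
        self.open_rooms.add(room_id)
        return True

    def start_recording(self, room_id):
        self.pending_recordings.add(room_id)

    def close_room(self, room_id):
        self.open_rooms.discard(room_id)
        self._stop_if_drained()

    def finish_recording(self, room_id):
        self.pending_recordings.discard(room_id)
        self._stop_if_drained()

    def _stop_if_drained(self):
        # Steps 6-7: shut down only when nothing is left to wait for.
        if self.draining and not self.open_rooms and not self.pending_recordings:
            self.stopped = True


def install_handler(instance):
    # Registers the handler for SIGTERM; must be called from the main thread.
    signal.signal(signal.SIGTERM, instance.handle_sigterm)
```

In this sketch an orchestrator would start the new instance first and then send SIGTERM to the old one. The old instance keeps serving its existing rooms, rejects new ones, and sets `stopped` only after the last room is closed and the last recording is finished. A `force` option, as mentioned in the Consequences section, would skip the wait and is not modelled here.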
diff --git a/docs/adrs/template.md b/docs/adrs/template.md
@@ -0,0 +1,72 @@
+---
+# These are optional elements. Feel free to remove any of them.
+status: "{proposed | rejected | accepted | deprecated | … | superseded by [ADR-0005](0005-example.md)}"
+date: {YYYY-MM-DD when the decision was last updated}
+deciders: {list everyone involved in the decision}
+consulted: {list everyone whose opinions are sought (typically subject-matter experts); and with whom there is a two-way communication}
+informed: {list everyone who is kept up-to-date on progress; and with whom there is a one-way communication}
+---
+# {short title of solved problem and solution}
+
+## Context and Problem Statement
+
+{Describe the context and problem statement, e.g., in free form using two to three sentences or in the form of an illustrative story.
+ You may want to articulate the problem in the form of a question and add links to collaboration boards or issue management systems.}
+
+<!-- This is an optional element. Feel free to remove. -->
+## Decision Drivers
+
+* {decision driver 1, e.g., a force, facing concern, …}
+* {decision driver 2, e.g., a force, facing concern, …}
+* … <!-- numbers of drivers can vary -->
+
+## Considered Options
+
+* {title of option 1}
+* {title of option 2}
+* {title of option 3}
+* … <!-- numbers of options can vary -->
+
+## Decision Outcome
+
+Chosen option: "{title of option 1}", because
+{justification. e.g., only option, which meets k.o. criterion decision driver | which resolves force {force} | … | comes out best (see below)}.
+
+<!-- This is an optional element. Feel free to remove. -->
+### Consequences
+
+* {Try to describe positive and negative consequences, every solution has some tradeoffs.}
+* … <!-- numbers of consequences can vary -->
+
+<!-- This is an optional element. Feel free to remove. -->
+## Pros and Cons of the Options
+
+### {title of option 1}
+
+<!-- This is an optional element. Feel free to remove. -->
+{example | description | pointer to more information | …}
+
+* Good, because {argument a}
+* Good, because {argument b}
+<!-- use "neutral" if the given argument weighs neither for good nor bad -->
+* Neutral, because {argument c}
+* Bad, because {argument d}
+* … <!-- numbers of pros and cons can vary -->
+
+### {title of other option}
+
+{example | description | pointer to more information | …}
+
+* Good, because {argument a}
+* Good, because {argument b}
+* Neutral, because {argument c}
+* Bad, because {argument d}
+* …
+
+<!-- This is an optional element. Feel free to remove. -->
+## More Information
+
+{You might want to provide additional evidence/confidence for the decision outcome here and/or
+ document the team agreement on the decision and/or
+ define when/how the decision should be realized and if/when it should be re-visited.
+Links to other decisions and resources might appear here as well.}