Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Redesign Build Deployment Process (External) #961

Merged
merged 96 commits into from
Aug 12, 2024
Merged
Show file tree
Hide file tree
Changes from 59 commits
Commits
Show all changes
96 commits
Select commit Hold shift + click to select a range
93c2232
Read secret_list from actual file if present, else use sample file.
Mar 21, 2024
a80d7e4
Matched conf/log files with internal repo
Mar 26, 2024
20e1856
Reading push.json values from environment variables
Mar 29, 2024
4a1005e
Choose analysis/debug.conf file based on ENV var
Apr 1, 2024
d4dad4a
Testing method to share tag between repos
nataliejschultz Apr 4, 2024
02c7fc7
Changed webserver.conf.json to Environment variables + Removed sed / …
Apr 10, 2024
4aeb4a6
Corrected logic to set test webserver ENV variables
Apr 11, 2024
2531672
Changed Webserver.conf + Db.conf to Environment variable + Removed se…
Apr 12, 2024
ea57afa
Reverting Natalie testing artifact changes + Adding image-push-merge
Apr 12, 2024
e8344e9
Fixes for failing TestTokenQueries print assertions
Apr 12, 2024
7f0d5f0
Fixes for failing TestTokenQueries print assertions
Apr 12, 2024
05551c8
Try-except block brought to the top
Apr 16, 2024
cabc98a
Merge pull request #2 from MukuFlash03/consolidate-differences
MukuFlash03 Apr 16, 2024
c39b537
Removed extraneous seed_model.json
Apr 16, 2024
385bc97
Merge pull request #3 from MukuFlash03/consolidate-differences
MukuFlash03 Apr 16, 2024
273c2ca
TODO added to change to master branch in YML file
Apr 24, 2024
2fdb469
Adding image-push-merge branch to automated CI/CD tests
Apr 24, 2024
91abdea
Upload artifact test - 1
Apr 25, 2024
bbdaaae
Upload artifact test - 2
Apr 25, 2024
5d0ca02
Added temporary test file
Apr 25, 2024
2e8a911
Changes to actions + echo
nataliejschultz Apr 25, 2024
f5aae47
Upload artifact test - 3
Apr 25, 2024
051ef66
Repository dispatch send - 1
Apr 26, 2024
6c44865
Workflow dispatch send - 1
Apr 26, 2024
4569dfb
Workflow dispatch send - 2
Apr 26, 2024
b562c2c
Workflow dispatch send - 3
Apr 26, 2024
25109b3
Workflow dispatch send - 3
Apr 26, 2024
709f3cf
Workflow dispatch send - 4
Apr 26, 2024
4b42185
Workflow dispatch send - 5
Apr 26, 2024
12d09ae
Workflow dispatch send - 6
Apr 26, 2024
94c129b
Workflow dispatch send - 7
Apr 26, 2024
0dd1245
Matrix build send - 1
Apr 26, 2024
5434914
Matrix build send - 2
Apr 26, 2024
0fd3e9d
Matrix build send - 3
Apr 26, 2024
ab399ea
Matrix build send - 3
Apr 26, 2024
91c0c1f
Matrix build send - 4
Apr 26, 2024
acecf3e
Matrix build send - 5
Apr 26, 2024
e778b3f
Fix for "url" KeyError observed in public-dash redesign testing
Apr 30, 2024
a0190d4
Fix for "url" KeyError observed in public-dash redesign testing
Apr 30, 2024
776f0b9
Artifact + Matrix - 1
May 2, 2024
17ac3cc
Artifact + Matrix - 2
May 2, 2024
706f74c
Artifact + Matrix - 3
May 2, 2024
d724fd1
Revert "Adding image-push-merge branch to automated CI/CD tests"
May 2, 2024
e6a2d79
Revert "TODO added to change to master branch in YML file"
May 2, 2024
4b3dfdf
Merge branch 'image-push-merge' into tags-combo-approach
May 2, 2024
f033f06
Added TODOs in github actions workflow YAML file
May 3, 2024
1207d79
Artifact + Matrix - 4
May 3, 2024
f306e8c
Merge branch 'consolidate-differences' into tags-combo-approach
nataliejschultz May 6, 2024
00d9565
Merge pull request #4 from MukuFlash03/tags-combo-approach
nataliejschultz May 6, 2024
f182790
Cleanup changes
nataliejschultz May 9, 2024
f1869ab
Update image_build_push.yml
nataliejschultz May 9, 2024
8f05955
More cleanup + testing image build?
nataliejschultz May 9, 2024
912bd34
Hardcoded webhost
nataliejschultz May 16, 2024
738b629
secret.py
nataliejschultz May 17, 2024
7102508
Restore intake.conf.sample
nataliejschultz May 17, 2024
3d77439
Reverting webserver.conf.sample
nataliejschultz May 17, 2024
a0f2424
Removing check_unset_env_vars
nataliejschultz May 17, 2024
9fb0e5a
Merge branch 'consolidate-differences' of https://github.com/MukuFlas…
nataliejschultz May 17, 2024
18a8872
Removing check_unset_env_vars functionality
nataliejschultz May 17, 2024
53015c7
Update image_build_push.yml
nataliejschultz May 21, 2024
41a410c
Update docker_start_script.sh
nataliejschultz May 21, 2024
7405ff1
Setting DB_HOST=db
nataliejschultz May 21, 2024
cd62247
Update docker_start_script.sh
nataliejschultz May 22, 2024
ebc8188
Rename debug.conf.internal.json to debug.conf.prod.json
nataliejschultz May 22, 2024
41ae79f
Update config.py
nataliejschultz May 22, 2024
290b0fc
Push to rename
nataliejschultz May 22, 2024
29869cd
Update and rename debug.conf.json.sample to debug.conf.dev.json
nataliejschultz May 22, 2024
38209ef
common.py fix?
nataliejschultz May 22, 2024
da485c2
Removing redundant DB_HOST setting
nataliejschultz May 23, 2024
e6b388b
reverting dockerfile + start script changes
nataliejschultz May 24, 2024
bb03c42
Testing workflows with compose
nataliejschultz May 26, 2024
cf50d17
Triggering workflows.
nataliejschultz May 26, 2024
6a21f5d
Reverting image_build_push.yml
nataliejschultz May 26, 2024
b961417
Not showing changes to branches for some reason in image_build_push.yml
nataliejschultz Jun 10, 2024
a245e7d
Adding comment to see if it resolves github display error
nataliejschultz Jun 17, 2024
2067055
🩹 Don't cat the db.conf file
shankari Aug 7, 2024
3000758
🩹 Use the correct filename in the gitignore
shankari Aug 7, 2024
6a8a13f
🩹 Set the default value for the `DB_HOST` as well
shankari Aug 7, 2024
fc183cb
🩹 Unify the supported user inputs across the debug and prod configs
shankari Aug 7, 2024
51e16de
🩹 Remove the cat of db.conf from the integration tests as well
shankari Aug 7, 2024
7ab37b6
🩹 Cleanup environment variables in the basic start script
shankari Aug 7, 2024
727c00c
♻️ Move the config to a different file name that makes more sense
shankari Aug 10, 2024
7f1be92
♻️ Refactor the backwards config file to be reusable
shankari Aug 10, 2024
7177e71
🔊 log the full backtrace if the config file is formatted incorrectly
shankari Aug 10, 2024
168ef10
♻️ Move the api configuration into the backwards compat as well
shankari Aug 10, 2024
3dea305
♻️ Move the api configuration into the backwards compat as well
shankari Aug 10, 2024
a0f0c6a
♻️ Move the push configuration into the backwards compat as well
shankari Aug 12, 2024
10624a6
🔊 Indicate that we are using the default production config
shankari Aug 12, 2024
0fe9476
Merge branch 'consolidate-differences' of https://github.com/MukuFlas…
shankari Aug 12, 2024
11d2a89
♻️ Pull out the code to reset the environment variable overrides to …
shankari Aug 12, 2024
4cc4c58
✅ Fix the expected text while checking for tokens
shankari Aug 12, 2024
357d4b8
🔥 Remove the copied over config file
shankari Aug 12, 2024
b6f59b0
♻️ Access the environment variables from the config using `.get`
shankari Aug 12, 2024
ec38835
✅ Remove environment variables that are likely to be different across…
shankari Aug 12, 2024
1a0d451
✅ Delete all irrelevant config variables
shankari Aug 12, 2024
2fe0816
✅ Copy/paste the actual tests from the failed CI run
shankari Aug 12, 2024
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
16 changes: 3 additions & 13 deletions .docker/docker_start_script.sh
Original file line number Diff line number Diff line change
Expand Up @@ -7,22 +7,11 @@
echo ${DB_HOST}
if [ -z ${DB_HOST} ] ; then
local_host=`hostname -i`
jq --arg db_host "$local_host" '.timeseries.url = $db_host' conf/storage/db.conf.sample > conf/storage/db.conf
else
jq --arg db_host "$DB_HOST" '.timeseries.url = $db_host' conf/storage/db.conf.sample > conf/storage/db.conf
export DB_HOST=$local_host
echo "Setting db host environment variable to localhost"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto. @nataliejschultz have you tried testing this with DB_HOST not set? I bet right now that it doesn't work

Copy link
Contributor

@nataliejschultz nataliejschultz May 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@shankari You're right, it does not work without DB_HOST set. I ran the server using my locally built image:

docker run --name em-server-no-host -d -e WEB_SERVER_HOST=0.0.0.0 -e STUDY_CONFIG=stage-program --network emission consolidate-diff-server-localbuild

and before I could exec into the container, it gave me this error:

2024-05-13 15:28:05 pymongo.errors.ServerSelectionTimeoutError: 172.26.0.3:27017: [Errno 111] Connection refused, Timeout: 30s, Topology Description: <TopologyDescription id: 664285c63b13fd554aa58ca9, topology_type: Unknown, servers: [<ServerDescription ('172.26.0.3', 27017) server_type: Unknown, rtt: None, error=AutoReconnect('172.26.0.3:27017: [Errno 111] Connection refused')>]>

Copy link
Contributor

@nataliejschultz nataliejschultz May 14, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@shankari I've tried several different ways to export the DB_HOST environment variable, but kept getting the connection refused error. I decided to try running the server container as it currently is in production by building a local image with an added echo.

docker build -t localhost-5 .
docker run --name em-server-no-host -d -e WEB_SERVER_HOST=0.0.0.0 -e STUDY_CONFIG=stage-program --network emission localhost-5

The echo I added is in docker_start_script.sh, after it's supposedly set with the sample file:

echo ${DB_HOST}
if [ -z ${DB_HOST} ] ; then
    local_host=`hostname -i`
    jq --arg db_host "$local_host" '.timeseries.url = $db_host' conf/storage/db.conf.sample > conf/storage/db.conf
else
    jq --arg db_host "$DB_HOST" '.timeseries.url = $db_host' conf/storage/db.conf.sample > conf/storage/db.conf
fi
cat conf/storage/db.conf

echo "LOCALHOST!!:" ${DB_HOST}

Container log:

2024-05-14 12:55:08 {
2024-05-14 12:55:08   "timeseries": {
2024-05-14 12:55:08     "url": "172.26.0.3",
2024-05-14 12:55:08     "result_limit": 250000
2024-05-14 12:55:08   }
2024-05-14 12:55:08 }
2024-05-14 12:55:08 LOCALHOST!!:

I got the connection refused error again. Thoughts:

  • It isn't working in the first place
  • I'm misunderstanding what's happening
  • My db container is the problem

Going to try a new db container first and see if that's the issue.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tried new DB container:
docker run --name db -d -p 27017:27017 --network emission mongo:4.4.0

Ran the exact same image as before, and got the same connection refused error.

Copy link
Contributor

@shankari shankari May 14, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand the challenge outlined in
#961 (comment)

The previous codebase assumed that if the DB_HOST was not set, the DB and the webapp were running on the same host. Obviously, if you try to test it with containers, it won't work.

However, with the new codebase, even if you run the DB and the webapp on the same container, it won't work because of #961 (comment)

Copy link
Contributor

@nataliejschultz nataliejschultz May 14, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Per the group meeting with @shankari, we clarified that:

  • The else was removed, so this code should work as expected
  • This method of setting the variable doesn't work when using containers, explaining the failures in my tests above

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this doesn't work with containers, then do we even need this functionality in a file that only runs when the dockerfile is run?? It makes sense in start_script.sh, but maybe not here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right. I was hoping you would come to that. A lot of these decisions were made a long time ago, when our deployment profile was very different than what we do now. If you recall, part of this redesign was to simplify and unify these startup scripts.

You should ask yourself these questions, answer them and then keep only the parts that are relevant.

Even in start_script.sh, why do you need this, given that there is a fallback in the config file?
And if you do need it, why does it need to be hostname -i?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I finally understand.
DB_HOST will never need to be set when running with containers

  • because it should be specified in the compose for testing and
  • containers run on their own IP(unless configured otherwise), so they're not going to connect using localhost anyway

start_script.sh runs all the tests, and the tests (and every server call in general) call get_database.py which gets the environment variable from emission.core.config. This has the fallback:

def get_config_data_from_env():
    config_data_env = {
        "timeseries": {
            "url": os.getenv('DB_HOST', "localhost"),

using hostname -i or localhost makes sense when not using containers, and the fallback to localhost should account for that. It seems like using localhost would be a rare case for someone running the server locally without containers.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nataliejschultz it is actually the most common case (at least now) for developers who are writing server code, aka making changes to server functionality and features, and not only the build scripts

fi
cat conf/storage/db.conf
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
cat conf/storage/db.conf


Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we have removed this, there is really no reason to cat conf/storage/db.conf since it won't exist anyway

#set Web Server host using environment variable
echo ${WEB_SERVER_HOST}
if [ -z ${WEB_SERVER_HOST} ] ; then
local_host=`hostname -i`
sed "s_localhost_${local_host}_" conf/net/api/webserver.conf.sample > conf/net/api/webserver.conf
else
sed "s_localhost_${WEB_SERVER_HOST}_" conf/net/api/webserver.conf.sample > conf/net/api/webserver.conf
fi
cat conf/net/api/webserver.conf

if [ -z ${LIVERELOAD_SRC} ] ; then
echo "Live reload disabled, "
else
Expand All @@ -35,6 +24,7 @@ fi

#TODO: start cron jobs
# change python environment
echo "Starting up e-mission-environment..."
source setup/activate.sh

# launch the webapp
Expand Down
53 changes: 41 additions & 12 deletions .github/workflows/image_build_push.yml
Original file line number Diff line number Diff line change
@@ -1,29 +1,21 @@
# This is a basic workflow to help you get started with Actions

name: docker image

# Controls when the action will run. Triggers the workflow on push or pull request
# events but only for the master branch
on:
push:
branches: [ master, gis-based-mode-detection ]

branches: [ master, gis-based-mode-detection, consolidate-differences ]

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nataliejschultz why do we have the consolidate-differences branch in here?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This has not been fixed yet.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wanted to leave it and continue to test in GH actions as the cleanup changes are made, but I removed it now!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This has still not been fixed

Copy link
Contributor

@nataliejschultz nataliejschultz Aug 7, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@shankari I have noticed a strange issue that I described to @MukuFlash03 ; some of the changes I made (such as removing consolidate-differences branch) do not appear to have taken effect in the diff display here. However, when I go to "edit file" in GitHub, the file is correct. It also looks correct in my IDE. I'm not sure why it's doing this 😞

# Env variable
env:
DOCKER_USER: ${{secrets.DOCKER_USER}}
DOCKER_PASSWORD: ${{secrets.DOCKER_PASSWORD}}

# A workflow run is made up of one or more jobs that can run sequentially or in parallel
jobs:
# This workflow contains a single job called "build"
build:
# The type of runner that the job will run on
runs-on: ubuntu-latest

# Steps represent a sequence of tasks that will be executed as part of the job
outputs:
date: ${{ steps.date.outputs.date }}

steps:
# Checks-out your repository under $GITHUB_WORKSPACE, so your job can access it
- uses: actions/checkout@v2
- name: docker login
run: | # log into docker hub account
Expand All @@ -46,3 +38,40 @@ jobs:
- name: push docker image
run: |
docker push $DOCKER_USER/${GITHUB_REPOSITORY#*/}:${GITHUB_REF##*/}_${{ steps.date.outputs.date }}

- name: Create a text file
run: |
echo ${{ steps.date.outputs.date }} > tag_file.txt
echo "Created tag text file"

- name: Upload Artifact
uses: actions/upload-artifact@v4
with:
name: docker-image-tag
path: tag_file.txt
overwrite: true

Comment on lines +48 to +53
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do we need the tag_file.txt to be uploaded here given that we are passing it directly to the workflows on line 77?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See Mukul's comment in the issue:

Why I chose to add artifact method as well?

The issue I was facing was with fetching the latest timestamp for the image tag in case of a push event trigger. This is because in the workflow dispatch, the server workflow itself would trigger the workflows and hence was in a way connected to these workflows. However, push events would only trigger the specific workflow in that specific dashboard repository to build and push the image and hence would not be able to retrieve the image tag directly.

So, I utilized the artifact upload and download method to:

  • upload the image timestamp as an artifact in the workflow run for future use.
  • download the uploaded artifact from the latest previously successful and completed workflow run in e-mission-server repo for a specific branch (currently set to tags-combo-approach but to be changed to master once changes are final).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see the issue here. You have to deal with dispatch and push separately anyway - one of them will have the image tag and the other will not. And given that we have the .env file checked in now, the push can just use that directly. I don't see why we need Yet Another file being uploaded from the workflow. Having said that, I am not going to hold up this PR for this, but it needs to be addressed as a polishing change.

Copy link
Contributor Author

@MukuFlash03 MukuFlash03 Aug 14, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do see now why we do not need the artifact for the push event. Detailed comments made here in public-dash Redesign PR. The comments talk about the corresponding artifact download in the dashboard repos and the explanation is applicable to the upload artifact on the server side as well.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@MukuFlash03 I utilize the artifacts to get the tags to the internal repo, so please don't remove them yet!!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nataliejschultz For now, I've tested removing only the Download artifacts step in the public-dashboard workflow.
I don't think it is related to the Upload artifacts step that you added for internal repo.
Neither am I making changes to the server workflow.

Can you please confirm if in this case it is alright to remove the artifact dealing with Push event from the dashboard workflows?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@MukuFlash03 sorry, I misunderstood what you were changing. It is okay with me to remove the download artifacts step!

dispatch:
needs: build
runs-on: ubuntu-latest

env:
DOCKER_IMAGE_TAG: ${{ needs.build.outputs.date }}

strategy:
matrix:
repo: ['e-mission/op-admin-dashboard', 'e-mission/em-public-dashboard']

steps:
- uses: actions/checkout@v4

- name: Trigger workflow in admin-dash, public-dash
# TODO: Create Fine-grained token with "Actions: write" permissions
run: |
shankari marked this conversation as resolved.
Show resolved Hide resolved
curl -L \
-X POST \
-H "Accept: application/vnd.github+json" \
-H "Authorization: Bearer ${{ secrets.GH_FG_PAT_TAGS }}" \
-H "X-GitHub-Api-Version: 2022-11-28" \
https://api.github.com/repos/${{ matrix.repo }}/actions/workflows/image_build_push.yml/dispatches \
-d '{"ref":"master", "inputs": {"docker_image_tag" : "${{ env.DOCKER_IMAGE_TAG }}"}}'
1 change: 1 addition & 0 deletions .gitignore
Original file line number Diff line number Diff line change
Expand Up @@ -16,6 +16,7 @@ CFC_DataCollector/moves_collect.log
webapp/www/lib
conf/**/*.json
!conf/**/*.schema.json
!conf/analysis/debug.conf.internal.json

*.ipynb_checkpoints*

Expand Down
2 changes: 1 addition & 1 deletion Dockerfile
Original file line number Diff line number Diff line change
Expand Up @@ -29,7 +29,7 @@ RUN bash -c "./.docker/setup_config.sh"

# #declare environment variables
ENV DB_HOST=''
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For example, you could also set the default value of DB_HOST here instead of having the if check above

ENV WEB_SERVER_HOST=''
ENV WEB_SERVER_HOST=0.0.0.0

ENV LIVERELOAD_SRC=''
ENV STUDY_CONFIG=''
Expand Down
14 changes: 14 additions & 0 deletions conf/analysis/debug.conf.internal.json
shankari marked this conversation as resolved.
Show resolved Hide resolved
Original file line number Diff line number Diff line change
@@ -0,0 +1,14 @@
{
"intake.segmentation.section_segmentation.sectionValidityAssertions": true,
"intake.cleaning.clean_and_resample.speedDistanceAssertions": false,
"intake.cleaning.clean_and_resample.sectionValidityAssertions": false,
"intake.cleaning.filter_accuracy.enable": false,
"classification.inference.mode.useAdvancedFeatureIndices": true,
"classification.inference.mode.useBusTrainFeatureIndices": true,
"classification.validityAssertions": true,
"output.conversion.validityAssertions": true,
"section.startStopRadius": 150,
"section.endStopRadius": 150,
"analysis.result.section.key": "analysis/inferred_section",
"userinput.keylist": ["manual/mode_confirm", "manual/purpose_confirm", "manual/replaced_mode", "manual/trip_user_input"]
}
10 changes: 8 additions & 2 deletions emission/analysis/config.py
Original file line number Diff line number Diff line change
@@ -1,11 +1,17 @@
import json
import os

def get_config_data():
try:
print("Trying to open debug.conf.json")
config_file = open('conf/analysis/debug.conf.json')
Comment on lines +6 to 7
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not fully convinced that this is the right approach. If we keep this backwards compat code around forever, we don't have any motivation for people to change to the new structure in the future. Having said that, I will not hold up the merge for this, but I do want to make sure that we have a plan to remove this in a year or so, and to notify users to change their config files, so that we don't end up with a bunch of hacky backwards compat code strewn all around the codebase.

except:
print("analysis.debug.conf.json not configured, falling back to sample, default configuration")
config_file = open('conf/analysis/debug.conf.json.sample')
if os.getenv("PROD_STAGE") == "TRUE":
print("In production environment, opening internal debug.conf")
config_file = open('conf/analysis/debug.conf.internal.json')
else:
print("analysis.debug.conf.json not configured, falling back to sample, default configuration")
config_file = open('conf/analysis/debug.conf.json.sample')
shankari marked this conversation as resolved.
Show resolved Hide resolved
ret_val = json.load(config_file)
config_file.close()
return ret_val
Expand Down
32 changes: 32 additions & 0 deletions emission/core/config.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,32 @@
import json
import logging
import os

def get_config_data_from_env():
config_data_env = {
"timeseries": {
"url": os.getenv('DB_HOST', "localhost"),
"result_limit": os.getenv('DB_TS_RESULT_LIMIT', 250000)
}
}
return config_data_env

def get_config_data():
try:
config_file = open('conf/storage/db.conf')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also don't think we need this given that we are going to use the environment variable going forward and it falls back to localhost "url": os.getenv('DB_HOST', "localhost"),

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should also remove the sample file to avoid confusion since we don't use it. Note that before this change, we used to fall back to the sample file if the db.conf file did not exist, but we have removed that now, so it is not clear what benefit the sample file gives us in terms of working out of the box or anything

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Our logic here was that we wanted to maintain the functionality of db.conf for users running the server locally, in case they have it set in a certain way.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would be fine with retaining this for now as a backwards compat hack but it should be removed in the next release.

ret_val = json.load(config_file)
config_file.close()
except:
ret_val = get_config_data_from_env()
if ret_val["timeseries"]["url"] == "localhost":
print("storage not configured, falling back to sample, default configuration")
return ret_val

config_data = get_config_data()

def get_config():
return config_data

def reload_config():
global config_data
config_data = get_config_data()
12 changes: 3 additions & 9 deletions emission/core/get_database.py
Original file line number Diff line number Diff line change
Expand Up @@ -10,16 +10,10 @@
import os
import json

try:
config_file = open('conf/storage/db.conf')
except:
print("storage not configured, falling back to sample, default configuration")
config_file = open('conf/storage/db.conf.sample')
import emission.core.config as ecc

config_data = json.load(config_file)
url = config_data["timeseries"]["url"]
result_limit = config_data["timeseries"]["result_limit"]
config_file.close()
url = ecc.get_config()["timeseries"]["url"]
result_limit = ecc.get_config()["timeseries"]["result_limit"]

try:
parsed=pymongo.uri_parser.parse_uri(url)
Expand Down
5 changes: 2 additions & 3 deletions emission/integrationTests/start_integration_tests.sh
Original file line number Diff line number Diff line change
Expand Up @@ -6,9 +6,8 @@ cd /src/e-mission-server
echo ${DB_HOST}
if [ -z ${DB_HOST} ] ; then
local_host=`hostname -i`
sed "s_localhost_${local_host}_" conf/storage/db.conf.sample > conf/storage/db.conf
else
sed "s_localhost_${DB_HOST}_" conf/storage/db.conf.sample > conf/storage/db.conf
export DB_HOST=$local_host
echo "Setting db host environment variable to localhost"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

when fixing the dockerized start script, we should fix this as well

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we need this environment variable set here anyway, since start_integration_tests.sh runs via a Dockerfile whose corresponding compose has the variable hardcoded.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nataliejschultz fair, but the integration tests are not the only way that people run the server. Developers run the server on their laptops to make changes as well. You need to now change the README to indicate that the environment variable should be set ...

fi
cat conf/storage/db.conf

Expand Down
29 changes: 22 additions & 7 deletions emission/integrationTests/storageTests/TestMongodbAuth.py
Original file line number Diff line number Diff line change
Expand Up @@ -47,10 +47,19 @@ def setUp(self):
self.uuid = uuid.uuid4()
self.testUserId = self.uuid
self.db_conf_file = "conf/storage/db.conf"
self.originalDBEnvVars = {}
self.createAdmin()

def tearDown(self):
self.admin_auth.command({"dropAllUsersFromDatabase": 1})
logging.debug("Deleting test db environment variables")
for env_var_name, env_var_value in self.testModifiedEnvVars.items():
del os.environ[env_var_name]
# Restoring original db environment variables
for env_var_name, env_var_value in self.originalDBEnvVars.items():
os.environ[env_var_name] = env_var_value
logging.debug("Finished restoring original db environment variables")
logging.debug("Restored original values are = %s" % self.originalDBEnvVars)
try:
os.remove(self.db_conf_file)
except FileNotFoundError as e:
Expand All @@ -67,14 +76,20 @@ def createAdmin(self):
self.admin_auth = pymongo.MongoClient(self.getURL(self.test_username, self.test_password)).admin

def configureDB(self, url):
config = {
"timeseries": {
"url": url,
"result_limit": 250000
}
self.testModifiedEnvVars = {
'DB_HOST' : url
}
with open(self.db_conf_file, "w") as fp:
json.dump(config, fp, indent=4)

for env_var_name, env_var_value in self.testModifiedEnvVars.items():
if os.getenv(env_var_name) is not None:
# Storing original db environment variables before modification
self.originalDBEnvVars[env_var_name] = os.getenv(env_var_name)
# Setting db environment variables with test values
os.environ[env_var_name] = env_var_value

logging.debug("Finished setting up test db environment variables")
logging.debug("Current original values are = %s" % self.originalDBEnvVars)
logging.debug("Current modified values are = %s" % self.testModifiedEnvVars)

def getURL(self, username, password, dbname="admin"):
return "mongodb://%s:%s@localhost/%s?authSource=admin&authMechanism=SCRAM-SHA-1" % (username, password, dbname)
Expand Down
28 changes: 9 additions & 19 deletions emission/net/api/cfc_webapp.py
Original file line number Diff line number Diff line change
Expand Up @@ -51,27 +51,17 @@
import emission.storage.timeseries.cache_series as esdc
import emission.core.timer as ect
import emission.core.get_database as edb
import emission.net.api.config as enac

try:
config_file = open('conf/net/api/webserver.conf')
except:
logging.debug("webserver not configured, falling back to sample, default configuration")
config_file = open('conf/net/api/webserver.conf.sample')

OPENPATH_URL="https://www.nrel.gov/transportation/openpath.html"
STUDY_CONFIG = os.getenv('STUDY_CONFIG', "stage-program")

config_data = json.load(config_file)
config_file.close()
static_path = config_data["paths"]["static_path"]
python_path = config_data["paths"]["python_path"]
server_host = config_data["server"]["host"]
server_port = config_data["server"]["port"]
socket_timeout = config_data["server"]["timeout"]
log_base_dir = config_data["paths"]["log_base_dir"]
auth_method = config_data["server"]["auth"]
aggregate_call_auth = config_data["server"]["aggregate_call_auth"]
not_found_redirect = config_data["paths"].get("404_redirect", OPENPATH_URL)
enac.reload_config()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do we reload the config only here and nowhere else?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The STUDY_CONFIG environment variable is pulled directly above, so changing this value and reloading allows testers/users to set the variable to different configs in the container using export and test.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This statement doesn't answer my question. I asked why the config is reloaded here and not anywhere else.

  • If we want testers/users to "set the variable to different configs in the container", why are we not concerned about people setting DB_HOST, for example.
  • I am not sure what you mean by "using export and test". The only way to restart a webapp in the container is to restart the container. At which point, the webapp will read the config again. Why do we need to reload here?

Copy link
Contributor

@nataliejschultz nataliejschultz May 24, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tested the server with and without the reload. When I removed the reload, one of the tests failed: Screenshot 2024-05-23 at 11 11 49 PM This particular test calls the cfc_webapp.py. It appears that the reload is useful in this case for refreshing the value of "not_found_redirect"

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nataliejschultz the screenshot is not loading for me (see screenshot below). We weren't re-loading the config before and the tests passed. I really don't see any reason why the config needs to be reloaded.

Screenshot 2024-08-06 at 6 42 04 PM

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Screenshot 2024-05-23 at 11 11 49 PM @shankari here is the screenshot

static_path = enac.get_config()["static_path"]
server_host = enac.get_config()["server_host"]
server_port = enac.get_config()["server_port"]
socket_timeout = enac.get_config()["socket_timeout"]
auth_method = enac.get_config()["auth_method"]
aggregate_call_auth = enac.get_config()["aggregate_call_auth"]
not_found_redirect = enac.get_config()["not_found_redirect"]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was this ever tested? I don't think that it would have ever worked.
The format of the webserver.conf does not have the static_path at the top level

{
  "paths" : {
    "static_path" : "webapp/www/",
    "python_path" : "main",
    "log_base_dir" : ".",
    "log_file" : "debug.log"
  },

And we didn't modify the path before returning it

    try:
        config_file = open('conf/net/api/webserver.conf')
        ret_val = json.load(config_file)
        config_file.close()
    except:


BaseRequest.MEMFILE_MAX = 1024 * 1024 * 1024 # Allow the request size to be 1G
# to accomodate large section sizes
Expand Down
36 changes: 36 additions & 0 deletions emission/net/api/config.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,36 @@
import json
import logging
import os

def get_config_data_from_env():
config_data_env = {
"static_path": os.getenv('WEB_SERVER_STATIC_PATH', "webapp/www/"),
"server_host": os.getenv('WEB_SERVER_HOST', "0.0.0.0"),
"server_port": os.getenv('WEB_SERVER_PORT', "8080"),
"socket_timeout": os.getenv('WEB_SERVER_TIMEOUT', "3600"),
"auth_method": os.getenv('WEB_SERVER_AUTH', "skip"),
"aggregate_call_auth": os.getenv('WEB_SERVER_AGGREGATE_CALL_AUTH', "no_auth"),
"not_found_redirect": os.getenv('WEB_SERVER_REDIRECT_URL', "https://www.nrel.gov/transportation/openpath.html")
}
return config_data_env


def get_config_data():
try:
config_file = open('conf/net/api/webserver.conf')
ret_val = json.load(config_file)
config_file.close()
except:
# if check_unset_env_vars():
logging.debug("webserver not configured, falling back to sample, default configuration")
ret_val = get_config_data_from_env()
return ret_val

config_data = get_config_data()

def get_config():
return config_data

def reload_config():
global config_data
config_data = get_config_data()
6 changes: 5 additions & 1 deletion emission/net/auth/secret.py
Original file line number Diff line number Diff line change
Expand Up @@ -4,7 +4,11 @@

class SecretMethod(object):
def __init__(self):
key_file = open('conf/net/auth/secret_list.json')
try:
key_file = open('conf/net/auth/secret_list.json')
except:
print("secret_list.json not configured, falling back to sample, default configuration")
key_file = open('conf/net/auth/secret_list.json.sample')
key_data = json.load(key_file)
key_file.close()
self.client_secret_list = key_data["client_secret_list"]
Expand Down
37 changes: 37 additions & 0 deletions emission/net/ext_service/push/config.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,37 @@
import json
import logging
import os

def get_config_data_from_env():
config_data_env = {
"provider": os.getenv("PUSH_PROVIDER"),
"server_auth_token": os.getenv("PUSH_SERVER_AUTH_TOKEN"),
"app_package_name": os.getenv("PUSH_APP_PACKAGE_NAME"),
"ios_token_format": os.getenv("PUSH_IOS_TOKEN_FORMAT")
}
return config_data_env

def get_config_data():
try:
config_file = open('conf/net/ext_service/push.json')
ret_val = json.load(config_file)
config_file.close()
except:
logging.warning("net.ext_service.push.json not configured, checking environment variables...")
ret_val = get_config_data_from_env()
# Check if all PUSH environment variables are not set
if (not any(ret_val.values())):
raise TypeError
return ret_val

try:
config_data = get_config_data()
except:
logging.warning("All push environment variables are set to None")

def get_config():
return config_data

def reload_config():
global config_data
config_data = get_config_data()
Loading
Loading