-
Notifications
You must be signed in to change notification settings - Fork 2
UTD-CRSS/exploreapollo-import-audio
Folders and files
Name | Name | Last commit message | Last commit date | |
---|---|---|---|---|
Repository files navigation
This project contains various scripts for uploading data and files. To use, copy src/config-sample.py to src/config.py, and fill in the dummy values. Note that values relating to Flickr are only necessary for ImageUpload. To run, python TransferS3Data.py [channel met_start met_end] python TransferS3Metrics.py python AudioUpload.py <local base folder> <S3 base folder> python MetricsUpload.py <local base folder> <S3 base folder> python ImageUpload.py flickr <Flickr album id> <S3 folder> <Mission name> [media attachable csv] python ImageUpload.py local <local base folder> <S3 folder> <Mission name> [media attachable csv] python ImageUpload.py <media attachable csv> python storyUpload.py <.csv file with extension> TransferS3Data.py - Finds transcript and wave files in S3 and puts audio segment and transcript items in the database. TransferS3Data takes three optional arguments - channel, met_min, and met_max. Providing these arguments limits the processed files to those referring to the given channel, and labeled as starting sometime between met_min and met_max. Note that this will skip any files with data between met_min and met_max, but begin before met_min! TransferS3Metrics.py - Finds .json files in S3 and puts Metric items in the database. TransferS3Metrics takes three optional arguments - channel, met_min, and met_max. Providing these arguments limits the processed files to those referring to the given channel, and labeled as starting sometime between met_min and met_max. Note that this will skip any files with data between met_min and met_max, but begin before met_min! AudioUpload.py - Finds transcript and wave files on the local machine, uploads them to S3, and puts audio segment and transcript items in the database. AudioUpload uploads all wav and txt files found in the local base folder, and preserves the directory structure as much as possible. If the call looks like python AudioUpload.py basefolder/ audio/ then a file basefolder/subfolder/file.txt is uploaded as audio/subfolder/file.txt in S3. MetricsUpload.py - Finds metric files (.json) on the local machine, uploads them to S3, and puts metric entries into the database.. MetricsUpload finds all .json files found in the local base folder, and preserves the directory structure as much as possible. If the call looks like python MetricsUpload.py basefolder/ metrics/ then a file basefolder/subfolder/file.txt is uploaded as metrics/subfolder/file.txt in S3. ImageUpload.py - When given 'flickr' as the first argument, copies a Flickr album over to S3 and fills in database information. Providing the optional csv file will also attach the images to moments and channels specified in the csv file. For an example of such a csv file, see src/examples/mediaAssociation. The image titles referenced there must match those found in Flickr. The script will also upload local files, by changing the first argument to 'local'. Finally, providing only a csv file as an argument attaches images already in the database to moments and channels that are already in the database. storyUpload.py - Takes a csv file (check out template in examples/Splashdown.csv) and uploads corresponding moments & story. The story name is the name of the .csv file (Splashdown.csv -> "Splashdown" story). The moments are listed as rows, and there is one row used for the description of the story itself. You can have empty rows to separate the moments for readability. The script will not upload the story and the moments unless each moment has audio/transcript data somewhere in its (met_start,met_end) interval. Make sure that the .csv file is in the same directory as the storyUpload.py file. So to run the Splashdown example first you would have to 'cp examples/Splashdown.csv Splashdown.csv' then 'python3 storyUpload.py Splashdown' FileManager.py - no longer used. ExtractMET.py - used ONLY to create a11-timestamps.csv, an attempt to extract MET times from a webpage. Read the file for more info. audio-upload.py - meant to iterate through a directory of audio clips and combine two with corresponding names before taking a clip from a specified time frame and outputting it as two 5-minute-long files. This is done because multi-channel audio only supports the upload of audio files 5 minutes or less in duration. Before running the script, use a .env file to set the appropriate AWS access key and secret key. After writing those into the .env file call 'source .env' to make them accessible in the python script. The script takes command line arguments to run. These arguments are as follows: startTime, endTime, readDirectory, writeDirectory. Start time and end time are ints and read and write directories are strings that correspond to directories. Read directory is referring to the local directory the audio is stored in and write directory is the directory in s3 that the audio is being uploaded to. An example command could look like this: 'python audio-upload.py 1500 2100 audio Lunar_Sighting' Save the output of the script for the next step. This script was made very specifically to work with the audio as we were given it in the fall of 2021, but components of it can be repurposed if this changes. For more detailed information, check the wiki page on GitHub at https://github.com/UTD-CRSS/exploreapollo-import-audio/wiki
About
No description, website, or topics provided.
Resources
Stars
Watchers
Forks
Releases
No releases published
Packages 0
No packages published