Update README.md
luiztauffer authored Nov 29, 2023
1 parent 5c09be5 commit c87115e
Showing 1 changed file with 13 additions and 1 deletion.
14 changes: 13 additions & 1 deletion containers/README.md
@@ -105,11 +105,23 @@ If having difficulties pushing the image to ECR:
The job submission must include the function arguments, which are passed as ENV vars to the running container.
See [examples](https://github.com/catalystneuro/spikeinterface_cloud/tree/main/examples) of how to submit jobs with Python scripts.
If the job requires GPU, make sure to pass this extra pair of ENV variables:
```
environment_variables.append({
'name': 'NVIDIA_DRIVER_CAPABILITIES',
'value': 'all'
})
environment_variables.append({
'name': 'NVIDIA_REQUIRE_CUDA',
'value': 'cuda>=11.0'
})
```
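For context, here is a minimal sketch of how such a list could be assembled into a job submission request. It assumes the job is submitted with boto3's Batch client (`submit_job` with `containerOverrides`); the helper names `build_gpu_job_request` and `submit`, and the argument name `MY_FUNCTION_ARG`, are hypothetical and not part of this repository. The linked examples remain the reference for the actual scripts.

```
from typing import Any, Dict


def build_gpu_job_request(
    job_name: str,
    job_queue: str,
    job_definition: str,
    function_args: Dict[str, Any],
) -> Dict[str, Any]:
    """Build keyword arguments for a Batch job submission (hypothetical helper)."""
    # Function arguments are passed to the container as ENV vars (strings only).
    environment_variables = [
        {"name": name, "value": str(value)} for name, value in function_args.items()
    ]
    # Extra pair of ENV variables for jobs that require GPU.
    environment_variables.append({"name": "NVIDIA_DRIVER_CAPABILITIES", "value": "all"})
    environment_variables.append({"name": "NVIDIA_REQUIRE_CUDA", "value": "cuda>=11.0"})
    return {
        "jobName": job_name,
        "jobQueue": job_queue,
        "jobDefinition": job_definition,
        "containerOverrides": {"environment": environment_variables},
    }


def submit(request: Dict[str, Any]) -> str:
    """Submit the request with boto3 (assumes boto3 is installed and AWS credentials are configured)."""
    import boto3  # imported lazily so the request can be built without AWS access

    response = boto3.client("batch").submit_job(**request)
    return response["jobId"]
```

Building the request separately from submitting it makes it possible to inspect the ENV vars before anything is sent to AWS, for example `build_gpu_job_request("my-job", "my-queue", "my-job-definition", {"MY_FUNCTION_ARG": 1})`, where all four values are placeholders.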
# Debugging AWS Batch
AWS Batch uses a series of other AWS services under the hood. If something doesn't work as expected (e.g. a job gets stuck as RUNNABLE), there are several places to find the possible causes of failure:
- AWS Batch > Jobs > specific-job-details
- EC2 > Auto Scaling groups > specific-asg-details
- EC2 > Instances > specific-instance-details > Monitoring
- CloudTrail > Event history
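
The first place in the list above (the job details) can also be checked from a script. The following is a rough sketch, assuming boto3's Batch client and its `describe_jobs` call; the helper names `summarize_jobs` and `fetch_job_summaries` are hypothetical. It only collects the status fields that Batch reports for a job and does not inspect the Auto Scaling group, the instances, or CloudTrail.

```
from typing import Any, Dict, Iterable, List


def summarize_jobs(describe_jobs_response: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Extract the status-related fields from a describe_jobs response (hypothetical helper)."""
    summaries = []
    for job in describe_jobs_response.get("jobs", []):
        container = job.get("container", {})
        summaries.append(
            {
                "jobId": job.get("jobId"),
                "status": job.get("status"),
                # May be absent, in which case the value is None.
                "statusReason": job.get("statusReason"),
                "containerReason": container.get("reason"),
                "logStreamName": container.get("logStreamName"),
            }
        )
    return summaries


def fetch_job_summaries(job_ids: Iterable[str]) -> List[Dict[str, Any]]:
    """Query AWS Batch for the given job IDs (assumes boto3 and configured AWS credentials)."""
    import boto3  # imported lazily so summarize_jobs can be used offline

    response = boto3.client("batch").describe_jobs(jobs=list(job_ids))
    return summarize_jobs(response)
```

If `statusReason` is empty for a job stuck as RUNNABLE, the remaining places in the list are the next to check.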
