fix: update readme
cdaein authored Jul 20, 2024
1 parent dea8e0b commit 3380445
Showing 1 changed file with 10 additions and 6 deletions.
16 changes: 10 additions & 6 deletions README.md
@@ -4,12 +4,6 @@

Automatic montage video generation with the Google Gemini LLM. You can ask it to search for moments in a video (e.g. "Find every moment the speaker says _something_") and it will respond with timestamps. The corresponding video clips are then rendered and stitched together into a montage video. You can also input multiple videos, and the program will generate a supercut video.
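
As a rough illustration of the render-and-stitch step described above, the sketch below builds ffmpeg command lines that cut one clip per timestamp and join the clips with ffmpeg's concat demuxer. The function names, file names, and the choice to stream-copy instead of re-encoding are assumptions made for illustration; this is not necessarily how the project itself renders clips.

```python
def clip_command(src, start, end, out):
    """Return an ffmpeg command that cuts the [start, end] range (seconds) from src."""
    # Assumption: stream copy ("-c copy") for speed; cuts then snap to keyframes.
    return ["ffmpeg", "-y", "-ss", str(start), "-to", str(end),
            "-i", src, "-c", "copy", out]


def montage_plan(src, timestamps, out="montage.mp4", list_file="clips.txt"):
    """Return (commands, concat_list_text) for building a montage from timestamps."""
    commands = []
    list_lines = []
    for i, (start, end) in enumerate(timestamps):
        clip = f"clip_{i:03d}.mp4"  # hypothetical intermediate file name
        commands.append(clip_command(src, start, end, clip))
        list_lines.append(f"file '{clip}'")
    # The concat demuxer reads a text file with one "file '<path>'" line per clip.
    commands.append(["ffmpeg", "-y", "-f", "concat", "-safe", "0",
                     "-i", list_file, "-c", "copy", out])
    return commands, "\n".join(list_lines) + "\n"


commands, concat_list = montage_plan("input.mp4", [(12, 14), (47, 48)])
```

To actually render, one would write `concat_list` to `clips.txt` and run each command in order, for example with `subprocess.run(cmd, check=True)`.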

## Example

I uploaded a 13-minute-long public domain documentary [A Bronx Morning (1931)](https://www.loc.gov/item/2021604036/) and asked it to "find street signages." Supercut created the following montage video:

<video width="320" height="240" src="https://github.com/user-attachments/assets/e5335458-ab37-406e-a9ce-020c99f89a19"></video>

## How to install

1. ffmpeg is required to create videos. On Mac: `brew install ffmpeg`
@@ -116,6 +110,16 @@ By default, the script looks for the most common resolution and scale and/or cro
- Each timestamp (and thus each generated video clip) will be 1 second or longer because Gemini can only look at video at 1 fps. The `--buffer <negative_value>` option can generate shorter clips, but because of video keyframing, the clips may show artifacts such as frozen frames.
- You may get better (but slower) results by using the `gemini-1.5-pro` model instead of the default `gemini-1.5-flash`, but beware of [the usage limit on the free tier](https://ai.google.dev/pricing).
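
To make the effect of a negative buffer concrete, here is a minimal sketch of how a buffer value could be applied to 1-second timestamps. It assumes the buffer is applied symmetrically to both ends of each clip and that clips shorter than a small minimum are dropped; `apply_buffer` and `min_length` are hypothetical names, and the script's real `--buffer` handling may differ.

```python
def apply_buffer(timestamps, buffer=0.0, min_length=0.1):
    """Widen (buffer > 0) or shrink (buffer < 0) each (start, end) pair, in seconds."""
    adjusted = []
    for start, end in timestamps:
        new_start = max(0.0, start - buffer)  # never seek before the start of the video
        new_end = end + buffer
        # Assumption: drop clips that a negative buffer has shrunk to (almost) nothing.
        if new_end - new_start >= min_length:
            adjusted.append((new_start, new_end))
    return adjusted


shorter = apply_buffer([(5.0, 6.0)], buffer=-0.25)  # 1 s clip becomes 0.5 s
longer = apply_buffer([(0.0, 1.0)], buffer=0.5)     # start is clamped at 0
dropped = apply_buffer([(5.0, 6.0)], buffer=-0.5)   # clip collapses and is removed
```

Because the shortened clip boundaries rarely fall on keyframes, clips cut this way are the ones most likely to show the frozen-frame problem mentioned above.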

## Examples

I uploaded a 13-minute-long public domain documentary [A Bronx Morning (1931)](https://www.loc.gov/item/2021604036/) and asked it to "find street signages." Supercut created the following montage video:

<video src="https://github.com/user-attachments/assets/e5335458-ab37-406e-a9ce-020c99f89a19"></video>

I uploaded 6 animated films from the silent film era, found in the Library of Congress collection, and extracted "text sound effects." Some timestamps were unrelated and had to be removed manually before creating the montage:

<video width="320" height="240" src="https://github.com/user-attachments/assets/4115d49c-14be-45a8-9a1e-3427d6ed65de"></video>

## References

- [Gemini Error Codes](https://ai.google.dev/gemini-api/docs/troubleshooting#error-codes)