Scripting API for Ghostty #2353

pomdtr · 2023-10-01T14:34:51Z

Some inspiration:

wezterm exposes its API through the CLI
iTerm can be controlled through AppleScript
Warp lets you configure "Launch Configurations" and launch them through a protocol handler
kitty has remote control

mitchellh (Maintainer) · 2023-11-06T17:31:54Z

Just dropping down some thoughts. I'm leaning towards a single-line text protocol in the format of memcached/redis/etc. This has various benefits:

Easy to debug. You can telnet and just start typing as an interactive repl.
Easy to build API clients for
Easy to implement and fast to parse
Can run over a variety of transports
Can be easily secured via TLS (on TCP)
Can be password-protected or if we ever want to go big we can support TLS client certificates
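
As a rough illustration of the single-line style described above, here is a minimal Python sketch of a request/response loop. The command names (PING, ECHO) and the `+`/`-ERR` reply prefixes are assumptions borrowed from Redis conventions purely for illustration; nothing here reflects an actual Ghostty protocol.

```python
from typing import TextIO


def handle_line(line: str) -> str:
    """Handle one request line and return one response line (without newline)."""
    parts = line.strip().split()
    if not parts:
        return "-ERR empty command"
    command, args = parts[0].upper(), parts[1:]
    if command == "PING":  # hypothetical command
        return "+PONG"
    if command == "ECHO":  # hypothetical command
        return "+" + " ".join(args)
    return f"-ERR unknown command '{command}'"


def serve_lines(reader: TextIO, writer: TextIO) -> None:
    """Read requests line by line and answer each with exactly one line."""
    for line in reader:
        writer.write(handle_line(line) + "\n")
        writer.flush()
```

Because every request and reply is a single line, the same handler could sit behind a TCP or Unix socket and be exercised by hand with telnet or socat.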

mitchellh (Maintainer) · 2024-01-22T15:59:19Z

I know GitHub links it, but I want to explicitly note @LordMZTE's proposal for dynamically linked plugins in #1358. See my comment there. I don't think these are mutually exclusive, but I did want to centralize all potential plugin discussion in one place for now until we spin out actual work on some of these. And I'm not ready to commit to working on these yet.

RGBCube · 2024-02-17T18:00:05Z

I'd like to propose another way to interact: escape sequences. We could write another standard that specifies how they are used.

The way it would work is simple, you just echo stuff in your terminal and Ghostty (and other standard implementors) parse it and apply the settings.

I don't know how the Kitty keyboard protocol works, but it would be similar, just print your desired setting, and it gets applied:

echo $"($escape_start)background($escape_sep)#800000($escape_end)"

This would be really useful over SSH, and could prevent you from accidentally deleting the primary database instead of the backup one because you selected the wrong terminal emulator window.

This isn't a replacement for the socket way, btw.

Another idea is that the background-color and other common settings could be documented in the standard, so all terminal emulators that implement the standard work with a single shell command when added to the shellrc.

For emulators that don't support it, the escape sequences could be picked carefully so it doesn't change output and just shows as a normal echo statement. Like %CFG-START$background%CFG-SEP$#800000%CFG-END$.

3 replies

akx Jan 3, 2025

Yeah, iTerm2 has a couple VT sequences for this: #4391

hartzell Jan 15, 2025

In iTerm2, I often manually run:

echo -e "\033]1337;SetColors=preset=Grass\a"

(varying the scheme) to make a particular window stand out.
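
For scripts, the same sequence can be built programmatically. A small Python sketch: the helper names are hypothetical, and the only sequence assumed is the iTerm2 `OSC 1337 ; SetColors=preset=<name>` form quoted above, terminated with BEL.

```python
import sys
from typing import TextIO

ESC = "\x1b"  # same byte as \033 in the echo example
BEL = "\x07"  # same byte as \a; terminates the OSC string


def osc(code: int, payload: str) -> str:
    """Wrap a payload in an Operating System Command (OSC) sequence."""
    return f"{ESC}]{code};{payload}{BEL}"


def iterm2_color_preset(name: str) -> str:
    """Build the iTerm2 sequence that switches to a named color preset."""
    return osc(1337, f"SetColors=preset={name}")


def emit(sequence: str, stream: TextIO = sys.stdout) -> None:
    """Write the sequence to a terminal (or any tty-like stream) and flush."""
    stream.write(sequence)
    stream.flush()
```

Calling `emit(iterm2_color_preset("Grass"))` from a script would have the same effect as the echo command. Terminals that do not implement the sequence are expected to ignore it.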

jakenvac Jan 22, 2025

I like this approach because it allows any program (even outside of ghostty) to write escape codes to the tty of a given terminal.

I use this feature with wezterm and aerospace to integrate my terminal and window manager keybinds.

Wezterm has a custom escape sequence to set user vars, which you can use to trigger event handlers in the config. If my window manager detects that wezterm is focused, it uses the wezterm cli to query the active tty and then write escape sequences to it.

That said, I think this works nicely alongside a more robust API/CLI rather than instead of one, as it's a bit of a one-way system without hacks.

tale · 2024-04-16T20:24:11Z

Just building on the conversation about making a text protocol. Would it be cursed to do something like <COMMAND> named_arg1=value1 named_arg2=value2 named_arg3=value3? I'm thinking really far into the future but let's assume I want to implement an API command that opens a new vertical split, something like RUN action=new_split:right OR RUN action=new_split dir=right. It's very simple and mimics exactly what keybinding syntax looks like in the configuration file.

A few problems/questions that this kind of syntax raises:

How do we support advanced parameters like maps (or is that even a good API decision)?
Should API control be stateless or does it make sense to return data to track the ID of a surface?
- This could also tie into some very far-future idea where QUERY (or something similar) is a command.
- What would a response format even look like if this is desired?
Should multiple commands at once be supported just delimited by newlines?

I chose something overly simple because, as mentioned above, it can be easily debugged, and building an API client for it is as simple as writing a parser in the client's language and using normal TCP connections.

I think at the very least following the naming scheme of keybinds is the best choice because then you don't need to reinvent the wheel and there's consistency between these two. The one downside is we already run into consistency issues because in a keybind you would do keybind = super+ctrl+up=resize_split:up,10. Those aren't named arguments so one or the other would need to change to conform.
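
To make the mismatch concrete, here is a minimal Python sketch that parses both shapes. The function names are hypothetical, and the grammar is only what the examples above imply: whitespace-separated `key=value` pairs for the API command, and `name:param,param` for a keybind action.

```python
from typing import Dict, List, Tuple


def parse_named(line: str) -> Tuple[str, Dict[str, str]]:
    """Parse '<COMMAND> key=value key=value ...' into (command, args)."""
    command, *pairs = line.split()
    args: Dict[str, str] = {}
    for pair in pairs:
        key, sep, value = pair.partition("=")
        if not sep or not key:
            raise ValueError(f"expected key=value, got {pair!r}")
        args[key] = value
    return command, args


def parse_keybind_action(action: str) -> Tuple[str, List[str]]:
    """Parse a keybind-style action such as 'resize_split:up,10'."""
    name, _, params = action.partition(":")
    return name, params.split(",") if params else []
```

With this sketch, `RUN action=new_split dir=right` yields named arguments, while `resize_split:up,10` yields positional ones, which is the inconsistency described above. Splitting on whitespace also means values cannot contain spaces, so quoting rules would still need to be decided.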

jcollie · 2024-04-16T20:52:28Z

Why invent a new protocol? Why not use HTTP & JSON over Unix sockets? Every language worth mentioning has HTTP & JSON support already (although it may take a bit of work to talk to a Unix socket) and it's basically infinitely extensible. That also means that we don't need to spend weeks bikeshedding a new protocol. Even cURL works over Unix sockets.

curl --unix-socket $GHOSTTY_API_SOCKET http://localhost/v0/keybinds | jq
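
The "bit of work" needed to reach a Unix socket from an HTTP library is usually small. A sketch in Python using only the standard library: the class and function names are hypothetical, and the `/v0/keybinds` path is just the placeholder from the curl example, not an existing endpoint.

```python
import http.client
import json
import socket
from typing import Any, Tuple


class UnixHTTPConnection(http.client.HTTPConnection):
    """HTTPConnection that dials a Unix domain socket instead of TCP."""

    def __init__(self, socket_path: str, timeout: float = 5.0) -> None:
        # "localhost" only fills the Host header; no DNS lookup happens.
        super().__init__("localhost", timeout=timeout)
        self.socket_path = socket_path

    def connect(self) -> None:
        # Replace the default TCP connect with an AF_UNIX connect.
        sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        sock.settimeout(self.timeout)
        sock.connect(self.socket_path)
        self.sock = sock


def get_json(socket_path: str, url_path: str) -> Tuple[int, Any]:
    """Issue a GET over the Unix socket and decode the JSON response body."""
    conn = UnixHTTPConnection(socket_path)
    try:
        conn.request("GET", url_path)
        response = conn.getresponse()
        return response.status, json.loads(response.read())
    finally:
        conn.close()
```

A call such as `get_json(os.environ["GHOSTTY_API_SOCKET"], "/v0/keybinds")` would then mirror the curl invocation, assuming such a socket and endpoint existed.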

LordMZTE · 2024-04-16T21:04:06Z

Why invent a new protocol? Why not use HTTP & JSON over Unix sockets? Every language worth mentioning has HTTP & JSON support already (although it may take a bit of work to talk to a Unix socket) and it's basically infinitely extensible. That also means that we don't need to spend weeks bikeshedding a new protocol. Even cURL works over Unix sockets.

curl --unix-socket $GHOSTTY_API_SOCKET http://localhost/v0/keybinds | jq

I understand the use of JSON here for a uniform and simple standard, but HTTP doesn't seem logical to me, mostly because we could just use a streaming JSON parser (Zig can be made to do this IIRC) to directly transfer JSON over the socket. It would also seem to me that most HTTP clients aren't exactly easy to convince to connect to a Unix socket, as it's simply not something HTTP was ever intended to do, as further made evident by the lack of a way to express this in a URL. Your example could translate to something as simple as:

echo '{"request": "keybinds"}' | socat $GHOSTTY_API_SOCKET - | jq

(Note: the JSON here is obviously just a placeholder and isn't meant to suggest a possible way a request could look.)
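
A sketch of the same idea in Python, with one simplification that is an assumption of this example: instead of a streaming JSON parser, each message is framed as a single line (newline-delimited JSON). The function names are hypothetical and the request shape is the placeholder from above.

```python
import json
import socket
from typing import Any, BinaryIO


def send_message(sock: socket.socket, message: Any) -> None:
    """Serialize one message as compact JSON followed by a newline."""
    # json.dumps emits no raw newlines unless indent= is set, so "\n" is a safe frame.
    sock.sendall(json.dumps(message).encode("utf-8") + b"\n")


def read_message(stream: BinaryIO) -> Any:
    """Read exactly one newline-terminated JSON message from the stream."""
    line = stream.readline()
    if not line:
        raise EOFError("peer closed the connection")
    return json.loads(line)


def request(socket_path: str, message: Any) -> Any:
    """Connect to a Unix socket, send one request, and return one reply."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.connect(socket_path)
        send_message(sock, message)
        with sock.makefile("rb") as stream:
            return read_message(stream)
```

With this framing, `request(path, {"request": "keybinds"})` plays the role of the socat pipeline, and a server could keep the connection open for further messages.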

tale · 2024-04-16T21:09:02Z

JSON makes sense; honestly, it wasn't my first choice because I was looking at Redis, which is just raw commands. I need to confirm, but is there overhead to JSON parsing? And do we even need the power of JSON?

I think following the keybind syntax is a good place to start because it's simple and shares a mental model in 2 different places. Introducing JSON basically opens up the can of worms for super complicated API bodies which I personally think is just a sign of bad design.

jcollie · 2024-04-16T21:30:59Z

Everyone hates JSON, because everyone has had to deal with JSON. It's the LCD for passing data between systems. I think that the overhead of JSON parsing is irrelevant unless you're expecting hundreds of API calls per second. Yeah JSON does open you up to the possibility of some truly horrific data structures, but the same could be said for trying to cram everything into a bespoke REDIS-like API. The point of using HTTP & JSON is that the best code is the code that you don't have to write. Each language's HTTP & JSON code is thoroughly vetted by a large community whereas a bespoke protocol will probably only ever be looked at by a very small number of people.

Plus cURL & wget mean that you can access the API from a simple shell script as well.

tale · 2024-04-16T21:35:24Z

telnet works on simple shell scripts! You can even test this with redis, if you telnet and just send PING it'll respond with pong. I'm not trying to cram everything into a "bespoke REDIS-like API". My suggestion stems from supporting an API that's agnostic so that the OSC way of invoking it is possible too. The keybinding action syntax already exists, we should follow it.

jcollie · 2024-04-16T21:36:22Z

The use of HTTP makes things simpler as it's a very common design pattern for such things. I suggested using Unix sockets as a security measure, but binding to a random port on localhost is an option as well. On Linux it would theoretically be possible to use cgroups and network namespaces to further limit access to the localhost port, but that might be more complication than necessary.

RGBCube · 2024-04-17T05:29:25Z

Why not use D-Bus?

mitchellh (Maintainer) · 2024-04-17T13:22:58Z

Why not use D-Bus?

This needs to be cross platform. Additionally, I’ve found dbus pretty complicated to use compared to a simple text proto.

1 reply

mmlb Jan 7, 2025

systemd folks are liking https://varlink.org/ over dbus and I believe I've seen some other projects start to use it too. Not sure if it makes sense here but wanted to give a dbus alternative that I think we will see more of in Linux.

jcollie · 2024-04-17T14:55:27Z

Technically, we're already using D-Bus since it's required by GTK. I wouldn't recommend it as the primary API endpoint. At some point we'll need to make more use of D-Bus to achieve deeper integration with Gnome (such as being able to link into the Gnome search interface) so at that point it may make sense to expose the Ghostty API over D-Bus in some way.

cryptocode · 2024-07-14T19:20:05Z

Is the plan also to expose this functionality via Ghostty's cli? I find myself wanting to port stuff like this wezterm/helix integration to Ghostty - i.e. making it easy to use the API from scripts would be great.

hauleth · 2024-08-06T07:37:09Z

I would start with a simple text API over a Unix socket. The question is whether it should be a stream socket or a datagram socket. It could also be exposed within the Ghostty session itself via an environment variable like $GHOSTTY_SOCKET. Datagrams make it easy to separate commands from each other, but IIRC they would limit the length of a single message.

Using Unix sockets would also allow us to share extra data via SCM_RIGHTS and similar, access the API from clients that aren't attached to a Ghostty session, etc.
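
The stream versus datagram trade-off can be seen with a pair of connected Unix sockets. A Python sketch, where the function names are hypothetical, the messages are just example action strings, and newline framing for the stream case is one assumed choice among several:

```python
import socket
from typing import List


def datagram_demo(messages: List[bytes]) -> List[bytes]:
    """Datagram sockets preserve boundaries: one send is one recv."""
    a, b = socket.socketpair(socket.AF_UNIX, socket.SOCK_DGRAM)
    with a, b:
        for message in messages:
            a.send(message)
        # Each recv returns exactly one datagram, up to the buffer size given.
        return [b.recv(65536) for _ in messages]


def stream_demo(messages: List[bytes]) -> List[bytes]:
    """Stream sockets deliver a byte stream, so the sender must add framing."""
    a, b = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    with a, b:
        for message in messages:
            a.sendall(message + b"\n")  # newline marks the end of a command
        a.shutdown(socket.SHUT_WR)  # signal end of input to the reader
        chunks = []
        while chunk := b.recv(4096):
            chunks.append(chunk)
    # Reassemble the stream and split on the delimiter we added.
    return b"".join(chunks).split(b"\n")[:-1]
```

Both functions return the original messages, but only the stream variant needs a delimiter, and only the datagram variant is bounded by a per-message size limit imposed by the operating system.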

tale · 2024-08-10T21:51:51Z

If it's desired, you could resume #1701, I no longer have the time to finish it unfortunately because of work but I'd be more than happy to help pass it on to someone else.

mitchellh (Maintainer) · 2024-08-10T21:55:42Z

Thanks @tale ❤️

CosmicToast · 2024-12-27T11:46:16Z

In the spirit of Ghostty aiming to feel native on every platform, it seems to me that a platform-native API (per-platform) approach isn't unreasonable; that way, that part of the code can also be platform-specific (rather than in libghostty). This would mean D-Bus on Linux, AppleScript on macOS, and so on, none of them conflicting.

I'm going to give adding AppleScript support a shot just to see what it's like (I've been waiting for an opportunity to try out Swift anyway). Worst case there's no community interest and it never gets merged. How do you feel about that approach @mitchellh?

To wit, I already have some core dictionary items defined (e.g. you can tell Ghostty to open a binary/script; it's the default behavior but it's now documented in the Script Editor dictionary), so I'll see how far I can get from there.

7 replies

CosmicToast Dec 28, 2024

Fair enough, I won't look into it any further then :)
As for preferences (for OS-agnostic), I would most like varlink and a ctl style utility to help make it scriptable without writing a client.

adamcstephens Dec 28, 2024

Don't take my opinion as final, I'm not a decider here. :) But may be worth getting some more feedback before proceeding in any direction.

uchuugaka Jan 1, 2025

Shortcuts and AppleScript could exist externally, to wrap whatever is used.
iTerm uses Python, but to be honest, it is so so.
JSON's big disadvantage is lack of real comments.
XML is verbose, but could work well and be validated.
shell scripts could be the easiest to support, but have lots of issues of their own.
It would be great to have a non-text UI on macOS for things.
Perhaps TextMate holds some inspiration for the variety of pragmatic usage of various languages and tools, but this adds implicit dependencies always, where shell scripts come back into view.

timvisher Jan 1, 2025

FWIW one of the primary components of an application being "native" on macOS to me is support for Mac Automation. This is a major missing component of non-Terminal.app/iTerm.app terminals that I've tried and always requires significant hacks to work around.

Anticipated support for this was one of the major reasons I was excited about Ghostty since Alacritty requires GUI scripting or hacking things together via the alacritty msg interface. Obviously this is an open source project and so my own excitement for the feature doesn't mean it'll get built but IMO Ghostty will never be 'truly native' unless it has support for Mac Scripting.

EdmundsEcho Jan 2, 2025

I agree with the sentiment that being "truly native" means working with Mac Scripting. I wonder if Ghostty might want to change its objective/claim if this feature is not part of the plan, if only to set expectations and otherwise clarify.

LukaHedtSV · 2025-01-02T00:26:39Z

It would definitely be a bonus for my setup to be able to launch Ghostty via a sh command.
Currently I use Fork as my main Git client, which has an "Open Repo in Terminal" feature (but doesn't recognise Ghostty or Wezterm as available terminals¹)

As it is, I have a sh command that lets me open Wezterm to the current repo baked into Fork settings, but I can't replicate it for Ghostty just now².

It's not the end of the world, and this is definitely the most basic version of the feature being discussed here, but we have a lot of repos in our workplace setup and sometimes I get bored typing cd to new ones :p

¹ I asked the Fork devs about this and they didn't seem interested in supporting every new terminal that popped up in MacOS (I'm not sure exactly how they register these internally, they support Alacritty, Warp, Kitty and iTerm2 out of the box, but not Wezterm for reasons I don't understand).
² As it is, launching the ghostty.app item with any parameters just crashes Fork instead, uncertain why. Given it's clearly not a supported Ghostty feature, I have no desire to try and chase down the reason.

2 replies

timvisher Jan 8, 2025

FWIW at least on macOS it's possible to get this behavior. You just have to resort to GUI scripting which is its own special hell. ¯\_(ツ)_/¯

One of the versions of the AppleScript I've written to do this is as follows:

My 'Ghostty' Script Library on one of my boxes

on runCommandInteractively(theCommand)
	tell me to makeNewWindow()
	tell application "System Events"
		keystroke (theCommand as text)
		keystroke return
	end tell
end runCommandInteractively

-- runCommandInteractively("echo ohai && sleep 5 && exit")

on makeNewWindow()
	if application "Ghostty" is running then
		tell application "System Events"
			set visible of application process "Ghostty" to true
			delay 0.1
		end tell
	end if
	
	tell application "Ghostty" to activate
	
	tell application "System Events"
		set preWindowCount to count of windows of application process "Ghostty"
		keystroke "n" using command down
		set tryUntil to (current date) + 5
		-- Wait (up to 5 seconds) until the new window has appeared
		repeat until preWindowCount is less than (count of windows of application process "Ghostty")
			delay 0.1
			if tryUntil is less than (current date) then
				error "whoops"
			end if
		end repeat
	end tell
end makeNewWindow

on activateOrMakeNewWindow()
	if application "Ghostty" is running then
		tell application "System Events"
			set visible of application process "Ghostty" to true
			delay 0.1
		end tell
	end if
	
	tell application "Ghostty" to activate
	
	tell application "System Events"
		if 0 is equal to (count of windows of application process "Ghostty") then
			tell me to makeNewWindow()
		end if
	end tell
end activateOrMakeNewWindow

-- activateOrMakeNewWindow()

I say one of because on another box I'm not on right now this has gotten more complicated because GUI scripting.

LukaHedtSV Jan 8, 2025

Aaaaah, I hadn't intended to learn AppleScript, but appreciate it!
It's not so bad now as it forces me to learn to use zoxide properly... But yeah, it'd be a neat feature anyway

Cheers!

pjv · 2025-01-02T19:16:43Z

I know this approach wouldn’t really be an API per se, but for my own purposes in scripting Ghostty, what I'd most prefer is that various actions like creating new windows, tabs, splits, sending text to a pane, etc. were built into the CLI so that you could hit them as commands from shell scripts in arbitrary shells / languages.

I have some elaborate AppleScripts for iTerm that set up my working environment with a lot of tabs for different remote hosts, and each tab has a horizontal split and a different command run in each of the panes. Setting this up manually every time would take 5 minutes. Having the script do it takes 30 seconds and I just have to watch. Using AppleScript is clunky but manageable - I only had to write those scripts once. But I’d sure rather write them in fish or bash or Python if I could.

Not having any kind of scriptability is the only thing that’s currently keeping me from switching from iTerm to Ghostty and I am looking forward to being able to switch because iTerm is feeling a bit bloated and ghostty feels nice and light and snappy.

2 replies

timvisher Jan 8, 2025

FWIW anything you can do in the Ghostty GUI is possible to do via AppleScript. Just not with native script terminology. See my comment here #2353 (reply in thread).

pjv Jan 9, 2025

jah, i know one could but i don’t know why one would. spending hours figuring out how to make applescript hit app menus, pause, send key combos, pause, tell other processes to tell processes to bla bla bla, etc. …is just a bridge too far for me. Especially when those menu elements inevitably change names along the way and then the script mysteriously stops working.

“its own special hell” indeed. Already too much hell getting good code on my screen. I don’t want to experience hell for terrible code. I’ll just trust that some not-impossibly-kludgy way to script ghostty is coming down the pike sometime soon.

micahcantor · 2025-01-03T19:35:03Z

Agreed with @pjv here. A command line interface seems like the most straightforward approach.

Additionally, it seems like the CLI needs to be updated, since it documents that actions like new_tab and new_window are supported when they actually aren't. For instance, ghostty +list-actions includes new_tab, but ghostty new_tab just prints the usage text. This led to my confusion in this question on Discord.

1 reply

timvisher Jan 8, 2025

I at least attempted to clarify this with #4116. It caught me, too. :\

bjesus · 2025-01-03T22:00:03Z

Just adding a use-case here: on tmux and then on WezTerm I have been using a similar (cli) interface to get the content of the active "pane", read the last line of it, and display it in my status bar (waybar). This has been super useful when running long tasks, as I can move to other work and hide the terminal completely, and yet see what's the latest in regard to the command I ran. The WezTerm way was wezterm cli get-text | grep -v -e '^$' | tail -n1, on tmux it was tmux capture-pane -p | .... Would be great to be able to do something similar with Ghostty.
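
The filtering half of that pipeline is independent of the terminal. A Python sketch of what `grep -v -e '^$' | tail -n1` does to captured text; the function name is hypothetical, and how the pane text would be obtained from Ghostty is left open, since no such command exists yet.

```python
from typing import Optional


def last_nonempty_line(text: str) -> Optional[str]:
    """Return the last line that is not completely empty, or None.

    Like grep -v -e '^$', only zero-length lines are dropped;
    lines containing just whitespace are kept.
    """
    for line in reversed(text.split("\n")):
        if line:
            return line
    return None
```

A status bar script would call this on the captured pane contents and display the result, falling back to an empty label when it returns None.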

mdaniel · 2025-01-03T22:50:04Z

Out of curiosity, don't the needs of programmatically controlling windows, tabs, and splits align with #1935? While digging into the actual spec, I observed that Control Mode also has notifications, meaning the integration could be bidirectional if one wished (e.g. Ghostty could inform listeners of external changes)

Pro

leverages the presumably monster amount of work required to parse and integrate Control Mode parsing and operations (list-sessions, list-windows, list-panes, new-window,...)
the tmux "api" has likely seen almost every use case that a user would need (although I mention below that if tmux's API becomes a _subset_ of the needs, that may be no good)
to speak to the other mentioned idea of driving the API via a DCS sequence, tmux Control Mode already has that handshake for starting and stopping the mode
Control Mode was actually designed by George for iTerm2, and thus it likely has considered the amount of work required to implement the operations inside a terminal emulator

Con

probably a worse DX than the more established redis-esque or http protocol ideas since Control Mode was really designed for machine-to-machine communication and not human-to-machine
the likely "uncanny valley" of mismatches between the tmux protocol and Ghostty's supported options
- a sub-risk of this appears any time one tries to adopt a protocol from one domain into a related domain: protocol "enhancements" or subtle "yes, but" behavior

1 reply

rhodes-b Jan 7, 2025

Out of curiosity, don't the needs of programmatically controlling windows, tabs, and splits align with #1935? While digging into the actual spec, I observed that Control Mode also has notifications, meaning the integration could be bidirectional if one wished (e.g. Ghostty could inform listeners of external changes)

Per #2353 (comment), it is a much more generic interface that goes beyond just windows / tabs / splits, like setting keybinds programmatically. Since that system is the goal anyway, everything should use it.

justyn · 2025-01-14T11:29:52Z

This discussion has been marked as an answer to #4870 which asks for a programmatic capability to access the scrollback buffer. Particularly thinking of tools like kitty-scrollback.

Because the title of the discussion here only references windows/tabs/splits but the substance is actually about remote communication for ghostty more generally, I didn't spot it previously.

Perhaps this discussion could be renamed?

4 replies

pomdtr Jan 15, 2025
Author

Hey, I'm the original poster. I did not really follow all the talks here; any suggestions for the rename? "Remote Communication API"?

justyn Jan 15, 2025

Yes or perhaps "API for managing layout, buffer and other terminal functionality"?
Only thinking that "remote communication" might perhaps be misunderstood.

00-kat Jan 15, 2025

"API for programmatically controlling Ghostty" might work too.

pomdtr Jan 15, 2025
Author

Maybe "Scripting API"?

susl · 2025-01-23T19:12:37Z

Hi, this thread was marked as an answer to #3795.
So I just wanted to add that it'll be beneficial to add "Launch Quick Terminal" to the CLI API as well. Thanks!
