With the successful construction of a GPT4 co-worker, we can now talk to emacs. (And also understand how disturbingly simple it is to build one – read on!)
WARNING The AI can make mistakes and you might say something horrible on accident like “Man, I hate my harddrive!” I hope you understand the ramifications 😀
With a voice interface, ask the AI to do work or answer questions for you in the context of the current buffer (file/directory). The AI will do the job asynchronously out of the way, leaving you to move on to the next task while the AI plugs away and speaks to you about its completed task.
The code is posted at https://github.com/pv-pterab-s/emacs-pinky-saver. It requires some work to set up (which we won’t cover here), but here are some highlights:
- This was disturbingly simple to set up using the amazing emacs package https://github.com/karthink/gptel and shell scripts.
- Realtime audio capture, speech-to-text, and text-to-speech are insanely simple with OpenAI’s web API’s (https://platform.openai.com/).
- See the shell scripts in the GitHub repository linked above.
- This project requires access to the
gpt-4-1106-previewmodel. All other models could not conform to the prompt driving this assistant.
- This project demonstrates the speed at which an AI will react when used to do work. The speech-to-text and text-to-speech were surprisingly fast!
This package functions by giving GPT4 an initial prompt, executing the
bash script that it produces, returning STDOUT and STDERR to GPT4, and iterating until GPT4 replies with the string
REPLY. The string after
REPLY is then synthesized to voice.
The prompt I found that facilitates the above is as follows. Note that the order of statements matters!
In this conversation, we seek to satisfy the english instruction `<INSTRUCTION>` by executing bash shell script code. In this conversation, you will reply with bash code that I will then execute. I will record the STDOUT and STDERR streams that the code produces and send it back to you. You will consider the outputs and reply with more bash code (if needed) to continue satisfying the instruction. We will iterate together in this fashion - essentially giving you shell access to my computer.
Only reply with bash code. Do not reply with any formatting like backticks. I need to be able to execute what you send me without reformatting or filtering.
Assume that any ambiguous references in the english instruction always resolve to one of: a filename, a directory name, a variable name in file, or a function name in a file. The context of the instruction is the directory named `<DIRECTORY>`. Thus, before writing bash code to fulfill the instruction, you must write bash code to collect enough information to define any ambiguous references in the english instruction. Always assume ambiguous references resolve to _something_ - you've just got to collect enough information to figure it out.
After you have fulfilled the instruction, reply in english by writing bash code that echo's the word `REPLY` followed with the reply. Avoid multi-line replies or replies that are very long. I will play back the reply using text to speech software.
I hope that this is enlightening! Ping me at email@example.com if you need help setting this up. Work is continuing; hopefully, this will be a full-fledged package soon!
Aside: The next step is to add another parallel AI as an adversary to double-check the primary AI’s work. Currently, the AI will make mistakes and believe it is complete. I think this setup will prove more successful when put in a feedback loop.