Creating a 24/7 AI-Generated Twitch Stream in Go
With this experiment I want to demonstrate how easy it is to mesh a few services and AI model capabilities into an engaging technological experience. The software powering this 24/7 stream is open source: https://github.com/LLuMus/lulis.
Here is a sample of the project: https://vimeo.com/886109183?share=copy. If you're lucky and I haven't run out of money, you might catch it live here: https://www.twitch.tv/pergunta_lula
This software does the following (illustrative Go sketches of the individual steps follow the list):
- It uses FFmpeg to start an RTMP stream to Twitch that plays a looping video.
- It watches the chat messages of a Twitch channel and detects messages that begin with our desired prefix ("Lula, ").
- Once we have a relevant message that does not contain common abuse words, we place it into a queue.
- We dequeue messages from this queue and process them: first we ask GPT-3.5 Turbo, with a predefined prompt, to impersonate Lula and answer the user's question.
- We take the text response and send it to a TTS service called Eleven Labs (https://elevenlabs.io/), where I previously created a cloned voice.
- Then we use https://replicate.com/ to run a model called Wav2Lip, which lip-syncs a short video or image to an audio track. The model needs a hosted audio file, so we store the audio on AWS S3 first.
- The output of all these steps is an MP4 video with a response to the user, which we interleave into the loop so that viewers of the stream see it in real time.
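To make the first step concrete, here is a minimal sketch of how the looping RTMP transmission could be started from Go with os/exec. The helper name and the exact FFmpeg flags are illustrative, not necessarily the ones used in the repository.

```go
package main

import (
	"fmt"
	"os/exec"
)

// startLoopStream launches ffmpeg so that it reads a local video in an
// endless loop and pushes it to the Twitch ingest endpoint via RTMP.
// The flag set below is a plausible minimal example, not the exact
// command used by the project.
func startLoopStream(loopVideo, streamKey string) (*exec.Cmd, error) {
	rtmpURL := fmt.Sprintf("rtmp://live.twitch.tv/app/%s", streamKey)
	cmd := exec.Command("ffmpeg",
		"-re",                // read input at native frame rate
		"-stream_loop", "-1", // loop the input video forever
		"-i", loopVideo,
		"-c:v", "libx264", "-preset", "veryfast",
		"-c:a", "aac",
		"-f", "flv", rtmpURL, // Twitch expects an FLV-wrapped RTMP stream
	)
	if err := cmd.Start(); err != nil {
		return nil, err
	}
	return cmd, nil
}
```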
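The chat-filtering step boils down to a prefix check and a small blocklist feeding a queue. A sketch, with an invented blocklist and a buffered channel standing in for the queue:

```go
package main

import "strings"

// blocked is a placeholder list of abusive terms; the real project keeps
// its own, much longer list.
var blocked = []string{"badword1", "badword2"}

// queue holds accepted questions until the processing loop picks them up.
var queue = make(chan string, 100)

// handleChatMessage takes a raw Twitch chat line, keeps only messages
// that start with the "Lula, " prefix and contain no blocked words, and
// enqueues them for processing.
func handleChatMessage(msg string) {
	const prefix = "Lula, "
	if !strings.HasPrefix(msg, prefix) {
		return
	}
	lower := strings.ToLower(msg)
	for _, w := range blocked {
		if strings.Contains(lower, w) {
			return // drop abusive messages
		}
	}
	select {
	case queue <- strings.TrimPrefix(msg, prefix):
	default:
		// queue is full; drop the message rather than block the chat reader
	}
}
```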
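For the GPT-3.5 Turbo call, one option is the community go-openai client; the repository may use a different client, and the system prompt below is only a placeholder:

```go
package main

import (
	"context"

	openai "github.com/sashabaranov/go-openai"
)

// askAsLula sends the viewer's question to GPT-3.5 Turbo together with a
// system prompt that asks the model to answer in the persona of Lula.
func askAsLula(ctx context.Context, apiKey, question string) (string, error) {
	client := openai.NewClient(apiKey)
	resp, err := client.CreateChatCompletion(ctx, openai.ChatCompletionRequest{
		Model: openai.GPT3Dot5Turbo,
		Messages: []openai.ChatCompletionMessage{
			{Role: openai.ChatMessageRoleSystem, Content: "You are Lula. Answer the viewer's question in his voice and style."},
			{Role: openai.ChatMessageRoleUser, Content: question},
		},
	})
	if err != nil {
		return "", err
	}
	return resp.Choices[0].Message.Content, nil
}
```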
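The text-to-speech step is a single HTTP call against the public Eleven Labs API; the request shape below follows their documented text-to-speech endpoint, though the project's actual payload may include more options:

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
)

// synthesize sends the text answer to the Eleven Labs text-to-speech
// endpoint for a previously cloned voice and returns the audio bytes.
func synthesize(apiKey, voiceID, text string) ([]byte, error) {
	payload, err := json.Marshal(map[string]any{"text": text})
	if err != nil {
		return nil, err
	}
	url := fmt.Sprintf("https://api.elevenlabs.io/v1/text-to-speech/%s", voiceID)
	req, err := http.NewRequest(http.MethodPost, url, bytes.NewReader(payload))
	if err != nil {
		return nil, err
	}
	req.Header.Set("xi-api-key", apiKey)
	req.Header.Set("Content-Type", "application/json")

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("eleven labs returned %s", resp.Status)
	}
	return io.ReadAll(resp.Body)
}
```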
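Finally, the lip-sync step creates a prediction on Replicate, pointing the Wav2Lip model at a face video and at the audio file previously uploaded to S3. The model version hash and the input field names below are placeholders that depend on the specific Wav2Lip model used:

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// startLipSync creates a Replicate prediction for a Wav2Lip model. The
// returned prediction ID has to be polled until the status is
// "succeeded" and an output video URL is available.
func startLipSync(apiKey, faceURL, audioURL string) (string, error) {
	body, err := json.Marshal(map[string]any{
		"version": "<wav2lip-model-version-hash>", // placeholder
		"input": map[string]string{
			"face":  faceURL,  // short video or image of the character
			"audio": audioURL, // public S3 URL of the generated speech
		},
	})
	if err != nil {
		return "", err
	}
	req, err := http.NewRequest(http.MethodPost,
		"https://api.replicate.com/v1/predictions", bytes.NewReader(body))
	if err != nil {
		return "", err
	}
	req.Header.Set("Authorization", "Token "+apiKey)
	req.Header.Set("Content-Type", "application/json")

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	if resp.StatusCode >= 300 {
		return "", fmt.Errorf("replicate returned %s", resp.Status)
	}

	var prediction struct {
		ID string `json:"id"`
	}
	if err := json.NewDecoder(resp.Body).Decode(&prediction); err != nil {
		return "", err
	}
	return prediction.ID, nil
}
```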
The goal is to create the experience of interacting with a live-streaming AI character that can engage a crowd. All the details of how to run the software can be found in the project's README. The service is configured through the following environment variables:
- TWITCH_CHANNEL_NAME=${LULIS_TWITCH_CHANNEL_NAME}
- TWITCH_STREAM_KEY=${LULIS_TWITCH_STREAM_KEY}
- TWITCH_CLIENT_ID=${LULIS_TWITCH_CLIENT_ID}
- OPEN_AI_KEY=${LULIS_OPEN_AI_KEY}
- ELEVEN_LABS_KEY=${LULIS_ELEVEN_LABS_KEY}
- ELEVEN_LABS_VOICE_ID=${LULIS_ELEVEN_LABS_VOICE_ID}
- REPLICATE_KEY=${LULIS_REPLICATE_KEY}
- AWS_BUCKET_NAME=${LULIS_AWS_BUCKET_NAME}
- AWS_REGION=${LULIS_AWS_REGION}
- AWS_ACCESS_KEY_ID=${LULIS_AWS_ACCESS_KEY_ID}
- AWS_SECRET_ACCESS_KEY=${LULIS_AWS_SECRET_ACCESS_KEY}
From these environment variables that we have to configure, you can already see the list of services we will have to prepare first (a sketch of loading the variables in Go follows the list):
- Twitch Account (use https://twitchapps.com/tmi/ for the TWITCH_CLIENT_ID)
- OpenAI Developer Account https://platform.openai.com/login?launch
- Eleven Labs https://elevenlabs.io/ with a Cloned Voice for the ELEVEN_LABS_VOICE_ID
- Replicate.com https://replicate.com/
- AWS S3 https://aws.amazon.com/pm/serv-s3 or another blob storage like Cloud Storage from GCP
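As an illustration of how these variables might be consumed inside the Go service, here is a minimal config loader; the struct and field names are invented for this sketch and do not necessarily match the repository:

```go
package main

import (
	"fmt"
	"os"
)

// Config gathers the external credentials the stream needs. The field
// names are illustrative; the repository may organize this differently.
type Config struct {
	TwitchChannel     string
	TwitchStreamKey   string
	TwitchClientID    string
	OpenAIKey         string
	ElevenLabsKey     string
	ElevenLabsVoiceID string
	ReplicateKey      string
	AWSBucketName     string
	AWSRegion         string
}

// LoadConfig reads the environment variables listed above and fails fast
// if any of them is missing.
func LoadConfig() (*Config, error) {
	get := func(name string) (string, error) {
		v := os.Getenv(name)
		if v == "" {
			return "", fmt.Errorf("missing required environment variable %s", name)
		}
		return v, nil
	}
	cfg := &Config{}
	var err error
	if cfg.TwitchChannel, err = get("TWITCH_CHANNEL_NAME"); err != nil {
		return nil, err
	}
	if cfg.TwitchStreamKey, err = get("TWITCH_STREAM_KEY"); err != nil {
		return nil, err
	}
	if cfg.OpenAIKey, err = get("OPEN_AI_KEY"); err != nil {
		return nil, err
	}
	// ... the remaining variables follow the same pattern.
	return cfg, nil
}
```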
With these services and variables configured, we can go ahead and run:
$ docker-compose up
This should immediately start the stream on the configured Twitch channel, and you can jump straight into Twitch chat to ask a question.
Next Steps
At the moment, the system is not yet prepared for parallel execution. It starts the RTMP transmission straight from a single container, which means that if we were to bring this to multiple Pods in a Kubernetes cluster, for example, we would have problems.
This system was deployed to Digital Ocean for demonstration, with one single container operating it. To make it more robust, faster, and easier to deploy properly, we would have to break down the RTMP part, the processing part, and the websocket communication part into separate systems, connected through a queue that lives outside our instances, for example NATS (https://nats.io/), Pub/Sub (GCP), or SQS (AWS). Since FFmpeg really does need to run as a single instance, we would keep it in one workload and scale up or replace only the processing part, so that multiple videos can be processed in parallel and be ready for display as questions keep popping up in the chat.
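To make this idea concrete, here is a rough sketch of how the in-process queue could be swapped for NATS, so that a single FFmpeg/RTMP workload consumes clips produced by any number of processing workers; the subject name is invented for this example:

```go
package main

import (
	"log"

	"github.com/nats-io/nats.go"
)

// publishClip is what a processing worker would call once a response
// video is rendered and uploaded; it only ships the clip's URL.
func publishClip(nc *nats.Conn, clipURL string) error {
	return nc.Publish("lulis.clips.ready", []byte(clipURL))
}

// consumeClips is what the single FFmpeg/RTMP workload would run: it
// receives ready clips from any worker and splices them into the loop.
func consumeClips(nc *nats.Conn, splice func(clipURL string)) (*nats.Subscription, error) {
	return nc.Subscribe("lulis.clips.ready", func(m *nats.Msg) {
		splice(string(m.Data))
	})
}

func main() {
	nc, err := nats.Connect(nats.DefaultURL)
	if err != nil {
		log.Fatal(err)
	}
	defer nc.Close()

	if _, err := consumeClips(nc, func(url string) {
		log.Println("next clip to interleave:", url)
	}); err != nil {
		log.Fatal(err)
	}
	select {} // keep the consumer running
}
```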
Deployment
For the sample project, the deployment process uses GitHub Actions, which builds an image and pushes it to a Digital Ocean registry. From there, it gets deployed to a Digital Ocean App.
The costs of this application so far have been meagre: every interaction with the system costs around 3 to 5 cents, plus there is a fixed cost of 20 dollars for a single instance on Digital Ocean.