Geek Out Time: “Vibe Coding” on Google Colab with OpenAI & DeepSeek
(Also on Constellar tech blog https://medium.com/the-constellar-digital-technology-blog/geek-out-time-vibe-coding-on-google-colab-with-openai-deepseek-074805f64a28)
A new kind of coding, known as “vibe coding”, has emerged. The term was popularized by Andrej Karpathy, former head of AI at Tesla and co-founder of OpenAI. This approach embraces the power of Large Language Models (LLMs) to accelerate coding by minimizing manual effort and relying on AI-generated completions. Instead of carefully crafting every line of code, developers guide AI tools with high-level instructions and iterate rapidly. With the advent of voice-powered AI interactions, coding becomes more intuitive, blurring the boundaries between ideation and execution.
Voice-powered coding is about flow. It enables developers to describe their intent naturally, relying on AI-assisted tools to translate high-level instructions into functioning code. Instead of laboriously writing every line, you explain what you need, and the AI generates the implementation. The experience transforms coding into an interactive and iterative process, reducing friction and increasing efficiency. It is fun to try out in this Geek Out Time.
The AI-Powered Evolution of Coding
The rise of AI coding assistants like GitHub Copilot, OpenAI’s GPT models, and DeepSeek has fundamentally changed software development. Instead of manually structuring every function, developers now collaborate with AI to suggest, refine, and even generate full applications. This shift is leading to a new paradigm — one where coding is more about problem-solving and less about syntax mastery.
Karpathy describes this shift as a move toward an “AI-native” development model. Engineers are increasingly adopting AI-driven workflows, acting more as orchestrators than traditional coders, guiding AI tools to build solutions efficiently. The goal is not just faster coding but an entirely new way to conceptualize and develop software.
Imagine building an application with minimal typing. You begin with a voice prompt: “Create a function that analyzes sentiment in customer reviews and visualizes it as a bar chart.” The AI generates Python code, including data preprocessing, sentiment analysis using NLP, and a visualization with Matplotlib. You refine it further by instructing: “Shade the bars from red to green based on sentiment intensity.” Within minutes, you have a working prototype.
This is the essence of voice-powered coding: rapid iteration, real-time AI assistance, and an intuitive workflow where development feels more like a creative process than a mechanical task.
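To make that concrete, here is a rough sketch of the kind of code such a prompt might come back with. This is only an illustration, not actual model output: it assumes TextBlob for the sentiment scores and Matplotlib's RdYlGn colormap for the red-to-green shading.

# Illustrative only: one plausible shape of the AI-generated prototype.
# Assumes TextBlob for sentiment scoring (pip install textblob if it is missing).
from textblob import TextBlob
import matplotlib.pyplot as plt

def plot_review_sentiment(reviews):
    """Score each review in [-1, 1] and plot a bar chart shaded red to green."""
    scores = [TextBlob(review).sentiment.polarity for review in reviews]
    # Map polarity (-1..1) onto the red-to-green colormap (0..1).
    colors = [plt.cm.RdYlGn((score + 1) / 2) for score in scores]
    plt.bar(range(len(reviews)), scores, color=colors)
    plt.xlabel("Review")
    plt.ylabel("Sentiment polarity")
    plt.title("Customer review sentiment")
    plt.show()

plot_review_sentiment([
    "Absolutely love this product!",
    "It broke after two days, very disappointed.",
    "Does the job, nothing special.",
])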
Running Voice-Powered Coding on Google Colab with OpenAI
One of the most exciting applications of AI-assisted coding is voice-to-code systems. Instead of typing, you speak your ideas, and AI translates them into executable code. Google Colab provides an excellent environment for this, where voice recognition tools integrate seamlessly with AI models.
Step 1: Install Required Packages
Run the following command in a Colab cell to install the necessary libraries:
!pip install openai scipy numpy ipython
Step 2: Set Up the AI-Powered Voice Coding System (OpenAI)
Now, copy and run the following full code in Google Colab:
import scipy.io.wavfile as wav
import numpy as np
import openai
import os
from openai import OpenAI
from IPython.display import Javascript
from google.colab import output
from base64 import b64decode

class VoiceCodeSystem:
    def __init__(self, api_key):
        self.client = OpenAI(api_key=api_key)
        self.sample_rate = 44100
        self.duration = 5  # Recording duration in seconds

    def record_audio(self):
        """Record audio using the browser's microphone"""
        print("Recording... Speak your code instructions (e.g., 'print hello')")
        # JavaScript to record audio in the browser
        record_js = """
        const sleep = ms => new Promise(r => setTimeout(r, ms));
        const record = (duration) => {
          return new Promise(async resolve => {
            const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
            const mediaRecorder = new MediaRecorder(stream);
            const chunks = [];
            mediaRecorder.ondataavailable = e => chunks.push(e.data);
            mediaRecorder.start();
            await sleep(duration * 1000);
            mediaRecorder.stop();
            mediaRecorder.onstop = () => {
              const blob = new Blob(chunks, { type: 'audio/wav' });
              const reader = new FileReader();
              reader.onloadend = () => resolve(reader.result.split(',')[1]);
              reader.readAsDataURL(blob);
            };
          });
        };
        record(%d)
        """ % self.duration
        # Execute JavaScript and get the recorded audio as base64
        audio_b64 = output.eval_js(record_js)
        audio_bytes = b64decode(audio_b64)
        return audio_bytes

    def save_audio(self, audio_bytes, filename="temp.wav"):
        """Save the recording to a WAV file"""
        with open(filename, "wb") as f:
            f.write(audio_bytes)
        return filename

    def transcribe_audio(self, audio_file):
        """Use the OpenAI Whisper API to transcribe audio"""
        try:
            with open(audio_file, "rb") as file:
                transcription = self.client.audio.transcriptions.create(
                    model="whisper-1",
                    file=file
                )
            return transcription.text
        except Exception as e:
            print(f"Error in transcription: {e}")
            return None

    def generate_code_with_gpt(self, text):
        """Use GPT to generate Python code from natural language"""
        try:
            prompt = f"""Convert the following natural language instruction into Python code.
            Return only the Python code without any explanation.
            Instruction: {text}
            Python code:"""
            response = self.client.chat.completions.create(
                model="gpt-3.5-turbo",
                messages=[
                    {"role": "system", "content": "You are a Python code generator. Generate only valid Python code without any explanation or markdown formatting."},
                    {"role": "user", "content": prompt}
                ],
                temperature=0.7,
                max_tokens=150
            )
            return response.choices[0].message.content.strip()
        except Exception as e:
            print(f"Error generating code: {e}")
            return None

    def run_session(self):
        """Run the voice coding session"""
        variables = {}
        temp_file = "temp.wav"
        print("Voice Coding Session Started!")
        print("\nSpeak your Python code instructions naturally.")
        print("Examples:")
        print("- 'Print hello'")
        print("- 'Create a list of numbers from 1 to 5'")
        print("- 'Calculate the square of 5'")
        print("- Say 'exit' or 'stop' to end the session")
        while True:
            try:
                # Record audio
                audio_bytes = self.record_audio()
                # Save to a temporary file
                self.save_audio(audio_bytes, temp_file)
                # Transcribe audio
                text = self.transcribe_audio(temp_file)
                if text:
                    print("\nTranscribed text:", text)
                    if "exit" in text.lower() or "stop" in text.lower():
                        print("Ending voice coding session")
                        break
                    # Generate code using GPT
                    code = self.generate_code_with_gpt(text)
                    if code:
                        print("\nGenerated code:")
                        print(code)
                        print("\nOutput:")
                        try:
                            exec(code, globals(), variables)
                            print("\nCode executed successfully!")
                        except Exception as e:
                            print(f"Error executing code: {e}")
                    else:
                        print("Could not generate code from the instruction")
            except Exception as e:
                print(f"Error in session: {e}")
            finally:
                # Clean up the temporary file
                if os.path.exists(temp_file):
                    os.remove(temp_file)

def test_system(api_key):
    """Test the voice coding system with a simple instruction"""
    system = VoiceCodeSystem(api_key)
    # Test GPT code generation without voice
    test_instruction = "print hello"
    print("Testing code generation with:", test_instruction)
    code = system.generate_code_with_gpt(test_instruction)
    if code:
        print("\nGenerated test code:")
        print(code)
        return True
    return False

def start_voice_coding(api_key):
    """Initialize and start the voice coding system"""
    print("Testing system first...")
    if test_system(api_key):
        print("\nSystem test successful! Starting voice coding session...\n")
        system = VoiceCodeSystem(api_key)
        system.run_session()
    else:
        print("System test failed. Please check your API key and try again.")

# Usage example
if __name__ == "__main__":
    api_key = "sk-proj-xxxxxx"  # Replace with your actual OpenAI API key
    start_voice_coding(api_key)
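A quick note on the API key: rather than hard-coding it as in the usage example above, you can store it in Colab's Secrets panel (the key icon in the left sidebar) and read it at runtime. A minimal sketch, assuming you have saved a secret named OPENAI_API_KEY and granted this notebook access to it:

# Read the key from Colab's Secrets panel instead of pasting it into the cell.
from google.colab import userdata

api_key = userdata.get("OPENAI_API_KEY")  # the secret name here is whatever you chose
start_voice_coding(api_key)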
Run it. I asked it to print a simple hello, and the transcribed instruction, the generated code, and its output all appeared in the cell.
Alternative: Running Voice-Powered Coding on Google Colab with DeepSeek
For those looking for an alternative to OpenAI, DeepSeek can be wired into the same workflow for transcription and AI-powered code generation. Note that the endpoints and model name in the code below are placeholders, so check DeepSeek's API documentation for the current routes before running it.
Step 1: Install Required Packages
Run this in a Colab cell:
!pip install requests scipy numpy ipython
Step 2: Set Up the AI-Powered Voice Coding System (DeepSeek)
# Packages for this cell were installed in Step 1 above.
import scipy.io.wavfile as wav
import numpy as np
import requests
import os
from IPython.display import Javascript
from google.colab import output
from base64 import b64decode

class VoiceCodeSystem:
    def __init__(self, api_key):
        self.api_key = api_key
        self.sample_rate = 44100
        self.duration = 5  # Recording duration in seconds

    def record_audio(self):
        """Record audio using the browser's microphone"""
        print("Recording... Speak your code instructions (e.g., 'print hello')")
        # JavaScript to record audio in the browser
        record_js = """
        const sleep = ms => new Promise(r => setTimeout(r, ms));
        const record = (duration) => {
          return new Promise(async resolve => {
            const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
            const mediaRecorder = new MediaRecorder(stream);
            const chunks = [];
            mediaRecorder.ondataavailable = e => chunks.push(e.data);
            mediaRecorder.start();
            await sleep(duration * 1000);
            mediaRecorder.stop();
            mediaRecorder.onstop = () => {
              const blob = new Blob(chunks, { type: 'audio/wav' });
              const reader = new FileReader();
              reader.onloadend = () => resolve(reader.result.split(',')[1]);
              reader.readAsDataURL(blob);
            };
          });
        };
        record(%d)
        """ % self.duration
        # Execute JavaScript and get the recorded audio as base64
        audio_b64 = output.eval_js(record_js)
        audio_bytes = b64decode(audio_b64)
        return audio_bytes

    def save_audio(self, audio_bytes, filename="temp.wav"):
        """Save the recording to a WAV file"""
        with open(filename, "wb") as f:
            f.write(audio_bytes)
        return filename

    def transcribe_audio(self, audio_file):
        """Use the DeepSeek API to transcribe audio"""
        try:
            url = "https://api.deepseek.com/v1/transcribe"  # Replace with DeepSeek's transcription endpoint
            headers = {
                "Authorization": f"Bearer {self.api_key}",
                "Content-Type": "audio/wav"
            }
            with open(audio_file, "rb") as file:
                response = requests.post(url, headers=headers, data=file)
            if response.status_code == 200:
                return response.json().get("text")
            else:
                print(f"Error in transcription: {response.status_code}, {response.text}")
                return None
        except Exception as e:
            print(f"Error in transcription: {e}")
            return None

    def generate_code_with_deepseek(self, text):
        """Use the DeepSeek API to generate Python code from natural language"""
        try:
            url = "https://api.deepseek.com/v1/generate-code"  # Replace with DeepSeek's code generation endpoint
            headers = {
                "Authorization": f"Bearer {self.api_key}",
                "Content-Type": "application/json"
            }
            data = {
                "prompt": f"Convert the following natural language instruction into Python code. Return only the Python code without any explanation.\n\nInstruction: {text}\n\nPython code:",
                "model": "deepseek-code",  # Replace with DeepSeek's code generation model
                "temperature": 0.7,
                "max_tokens": 150
            }
            response = requests.post(url, headers=headers, json=data)
            if response.status_code == 200:
                return response.json().get("code")
            else:
                print(f"Error generating code: {response.status_code}, {response.text}")
                return None
        except Exception as e:
            print(f"Error generating code: {e}")
            return None

    def run_session(self):
        """Run the voice coding session"""
        variables = {}
        temp_file = "temp.wav"
        print("Voice Coding Session Started!")
        print("\nSpeak your Python code instructions naturally.")
        print("Examples:")
        print("- 'Print hello'")
        print("- 'Create a list of numbers from 1 to 5'")
        print("- 'Calculate the square of 5'")
        print("- Say 'exit' or 'stop' to end the session")
        while True:
            try:
                # Record audio
                audio_bytes = self.record_audio()
                # Save to a temporary file
                self.save_audio(audio_bytes, temp_file)
                # Transcribe audio
                text = self.transcribe_audio(temp_file)
                if text:
                    print("\nTranscribed text:", text)
                    if "exit" in text.lower() or "stop" in text.lower():
                        print("Ending voice coding session")
                        break
                    # Generate code using DeepSeek
                    code = self.generate_code_with_deepseek(text)
                    if code:
                        print("\nGenerated code:")
                        print(code)
                        print("\nOutput:")
                        try:
                            exec(code, globals(), variables)
                            print("\nCode executed successfully!")
                        except Exception as e:
                            print(f"Error executing code: {e}")
                    else:
                        print("Could not generate code from the instruction")
            except Exception as e:
                print(f"Error in session: {e}")
            finally:
                # Clean up the temporary file
                if os.path.exists(temp_file):
                    os.remove(temp_file)

def test_system(api_key):
    """Test the voice coding system with a simple instruction"""
    system = VoiceCodeSystem(api_key)
    # Test code generation without voice
    test_instruction = "print hello"
    print("Testing code generation with:", test_instruction)
    code = system.generate_code_with_deepseek(test_instruction)
    if code:
        print("\nGenerated test code:")
        print(code)
        return True
    return False

def start_voice_coding(api_key):
    """Initialize and start the voice coding system"""
    print("Testing system first...")
    if test_system(api_key):
        print("\nSystem test successful! Starting voice coding session...\n")
        system = VoiceCodeSystem(api_key)
        system.run_session()
    else:
        print("System test failed. Please check your API key and try again.")

# Usage example
if __name__ == "__main__":
    api_key = "your-deepseek-api-key"  # Replace with your actual DeepSeek API key
    start_voice_coding(api_key)
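One more note on the DeepSeek version: the transcribe and generate-code URLs above are placeholders. At the time of writing, DeepSeek's public API is an OpenAI-compatible chat-completions interface rather than dedicated routes like those, so another option is to point the OpenAI client from the first section at DeepSeek's base URL for the code-generation step and keep Whisper (or another speech-to-text service) for transcription. A minimal sketch, assuming the base URL https://api.deepseek.com and the deepseek-chat model (verify both against DeepSeek's current documentation):

# Code generation via DeepSeek's OpenAI-compatible chat completions API.
# The base URL and model name are assumptions; check DeepSeek's docs for current values.
from openai import OpenAI  # the openai package installed in the first section

deepseek_client = OpenAI(api_key="your-deepseek-api-key",
                         base_url="https://api.deepseek.com")

def generate_code_with_deepseek_chat(text):
    """Ask DeepSeek to turn a natural-language instruction into Python code."""
    response = deepseek_client.chat.completions.create(
        model="deepseek-chat",
        messages=[
            {"role": "system", "content": "You are a Python code generator. Return only valid Python code."},
            {"role": "user", "content": f"Instruction: {text}\n\nPython code:"}
        ],
        temperature=0.7,
        max_tokens=150,
    )
    return response.choices[0].message.content.strip()

print(generate_code_with_deepseek_chat("print hello"))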
Both implementations provide an effective way to transform speech into functional Python code, allowing developers to choose the best fit for their workflow.
By integrating AI into the coding process, voice-powered programming is transforming how developers build software. Whether you use OpenAI or DeepSeek, this approach enables a more fluid, creative, and efficient way to write code. That said, AI-powered coding also introduces concerns. One is over-reliance, where developers begin to accept AI-generated code without fully understanding it. Debugging AI-generated code can also be more challenging, as unintended behaviors or inefficiencies are harder to trace back. Will critical thinking and problem-solving skills erode if developers depend on AI for routine coding tasks rather than actively engaging with the logic and structure of their programs? Nevertheless, we all have to learn, adopt, and evolve in one way or another.
Enjoy experimenting and have fun!