How to set up the Self Operating Computer Framework to take control of a computer running Windows 11

Note: It would be wise to set up a Windows virtual machine so that you do not damage your main Windows system or accidentally delete something while you let an AI take control of your computer. This guide does not show how to set up a virtual machine; however, that process is very well documented. I recommend using VMware Workstation as your hypervisor.

You will learn the following by completing this guide from beginning to end:

  • Basic Git operations and commands
  • Navigating the Windows command line
  • Interacting with open source projects on GitHub
  • Reading source code of a Python project
  • Gaining exposure to Application Programming Interface (API) terminology
  • Using OpenAI's API
  • Generating API keys
  • Configuring a project to have AI capabilities

In this guide I am going to walk you through setting up the Self Operating Computer Framework. The Self Operating Computer Framework is an open source project developed to continually test how far modern AI tools can go in taking full control of a computer. You give it instructions in plain English, and it attempts to carry out those actions on your computer. You should be able to ask it to buy you some new Nike shoes on Amazon or to pull up your favorite YouTuber's channel. If you would like to see this in action, I also made a video demonstrating its capabilities.

Check it out below:


This guide focuses primarily on how to set this up in Windows, but please know that the Self Operating Computer Framework (SOCF) works on macOS and Linux as well. I will write separate guides for both of those platforms in the future.

Let's begin!


1. Create an Account with OpenAI Playground

SOCF makes use of OpenAI's GPT-4 vision preview model, which is accessible through OpenAI's API. GPT-4 vision is the functionality that allows ChatGPT to "see". An application programming interface (API for short) is a software interface that allows developers to use the capabilities of another software system in their own application. The API can process images, which is how SOCF knows how to control our computer. After each action, SOCF takes a screenshot, uses GPT-4 vision through OpenAI's API to "see" that image, and then attempts the next action toward what you asked it to do.
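To make that screenshot, look, act cycle a bit more concrete, here is a minimal Python sketch of the loop. It is not SOCF's actual code: the function names, the "DONE" stop signal, and the action strings are all hypothetical, and the three callables are stand-ins so the sketch runs without a screen or an API key.

```python
from typing import Callable, List


def run_objective(
    objective: str,
    take_screenshot: Callable[[], bytes],
    ask_model: Callable[[str, bytes], str],
    perform_action: Callable[[str], None],
    max_steps: int = 10,
) -> List[str]:
    """Repeat screenshot -> model -> action until the model answers DONE."""
    history: List[str] = []
    for _ in range(max_steps):
        image = take_screenshot()             # capture the current screen
        action = ask_model(objective, image)  # the vision model picks the next action
        if action == "DONE":                  # hypothetical stop signal
            break
        perform_action(action)                # e.g. click, type, or search
        history.append(action)
    return history


# Stand-ins (assumptions for illustration): a scripted "model" and no-op actions.
scripted = iter(["CLICK browser", "TYPE youtube.com", "DONE"])
actions = run_objective(
    "open YouTube",
    take_screenshot=lambda: b"fake-image-bytes",
    ask_model=lambda objective, image: next(scripted),
    perform_action=lambda action: None,
)
print(actions)
```

The max_steps limit is there because a real model may never decide it is finished, and every extra step costs another API call.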

Navigate to OpenAI's API site using your favorite web browser. Here is a link to the site for your convenience:

Click Sign up and create your account.

After you create your account you are going to want to spend $5. I know that seems weird, but let me explain. When I first attempted to set up the project, it didn't work. I kept getting errors when attempting to run it and I was getting sad about it. After some careful research I discovered a forum that explained how people can get access to GPT-4 vision preview.

You need access to the GPT-4 vision preview model to get the project working.

You must purchase at least $5 worth of API credits. Once the payment is processed, you will unlock the ability to use the GPT-4 vision preview model through the API. Awesome, you are now done with step 1 of this process.
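If you want to confirm that your account really has access, one option is to look for the model in the list returned by OpenAI's models endpoint. The sketch below is a rough illustration using only the Python standard library. The model ID "gpt-4-vision-preview" is my assumption of how the vision preview model is named, and fetch_models needs a real API key (created later in this guide), so the example at the bottom uses a hand-written payload instead of calling the API.

```python
import json
import urllib.request

MODELS_URL = "https://api.openai.com/v1/models"
VISION_MODEL = "gpt-4-vision-preview"  # assumed model ID, check your account's model list


def has_model(models_payload: dict, model_id: str) -> bool:
    """Return True if model_id appears in a models-list response."""
    return any(m.get("id") == model_id for m in models_payload.get("data", []))


def fetch_models(api_key: str) -> dict:
    """Download the list of models visible to this API key (needs network access)."""
    request = urllib.request.Request(
        MODELS_URL, headers={"Authorization": f"Bearer {api_key}"}
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


# Offline example with a made-up payload in the same shape as the real response.
sample = {"object": "list", "data": [{"id": "gpt-3.5-turbo"}, {"id": "gpt-4-vision-preview"}]}
print(has_model(sample, VISION_MODEL))
```

With a real key you would call has_model(fetch_models(your_key), VISION_MODEL) instead of using the sample payload.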


I'm API Rich :)


2. Download and install Python3

SOCF is written in Python, so we will need Python installed on our computer to run it. If you are not sure if you have Python installed on your machine then you probably don't, so go download and install it. As you go through the install, please make sure to check the box that says:

"Add python.exe to PATH"

The exact wording depends on the installer version (older installers say "Add Python 3.x to PATH"). Checking this box adds Python to your PATH environment variable, which lets PowerShell recognize Python commands directly from the command line. Installing Python also gives you pip, the Python package manager we will use later to install SOCF. Python 3 can be downloaded here:

Now let's download SOCF.

3. Download the Self Operating Computer Framework project from GitHub

The project is hosted on GitHub so it can be manually downloaded or cloned using Git. When I got this working I used Git to clone the project. To do this I had to download and install Git for Windows.


git for Windows
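With Python and Git both installed, a quick sanity check is to ask Python which of the tools it can find on your PATH. This is just a small sketch; the list of command names is my own choice, so adjust it to whatever you installed.

```python
import shutil


def find_on_path(commands):
    """Map each command name to its full path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in commands}


# Command names to look for (an assumption, edit to taste).
for name, location in find_on_path(["python", "pip", "git"]).items():
    print(f"{name}: {location or 'not found on PATH'}")
```

If something shows up as not found, close and reopen PowerShell first, since a new PATH only applies to windows opened after the install.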


Once Git was installed, I used the git clone command shown in step 4 to clone the project from its GitHub repository (fancy name for the folder the program files are stored in).

Git is a version control system, so it tracks changes made to the source project and can pull those changes into your local copy of the project. I mention this because I used the term clone, which can be thought of in the same sense as downloading files. With both cloning and downloading, we are pulling files from somewhere over a network down to our own local system. The difference is that a clone made with Git also tracks version changes.


4. Clone the Self Operating Computer Framework on your computer

Boom, so now we need to navigate into the project's directory (fancy name for a folder on your local computer). I recommend you open PowerShell and run it as administrator. PowerShell is a powerful command-line shell and scripting language created by Microsoft.

cd (change directory) is a command used to, you guessed it, change directories. I like to use the command dir frequently to see where I am in the file system and what contents are in different directories.

Be mindful of where you run the git clone command from, because that is where the self-operating-computer repository will be stored. To clone the project navigate to the self-operating-computer GitHub site, copy the URL (https://github.com/OthersideAI/self-operating-computer.git) and use it like so:

Click the copy button next to the link
PS C:\Users\Bob> git clone https://github.com/OthersideAI/self-operating-computer.git        
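To see why the starting directory matters, here is a small Python sketch that works out where git clone will put the project. It relies on Git's default behavior of naming the new folder after the last part of the URL with the .git suffix removed; the function name is my own.

```python
from pathlib import Path, PurePosixPath
from urllib.parse import urlparse


def clone_destination(repo_url: str, working_dir: Path) -> Path:
    """Return the folder that `git clone <repo_url>` creates when run from working_dir."""
    name = PurePosixPath(urlparse(repo_url).path).name  # last part of the URL path
    if name.endswith(".git"):
        name = name[: -len(".git")]                      # Git drops the .git suffix
    return working_dir / name


url = "https://github.com/OthersideAI/self-operating-computer.git"
print(clone_destination(url, Path.cwd()))
```

Run from C:\Users\Bob, the clone would therefore land in C:\Users\Bob\self-operating-computer, which matches the prompt in the next command.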

cd into the self-operating-computer directory.

PS C:\Users\Bob> cd .\self-operating-computer\        

List the contents of the directory.


PS C:\Users\Bob\self-operating-computer> dir

    Directory: C:\Users\Bob\self-operating-computer

Mode                 LastWriteTime         Length Name
----                 -------------         ------ ----
d-----         12/1/2023   9:21 AM                operate
d-----         12/1/2023   9:21 AM                readme
-a----         12/1/2023   9:21 AM             30 .example.env
-a----         12/1/2023   9:21 AM           3471 .gitignore
-a----         12/1/2023   9:21 AM           1087 LICENSE
-a----         12/1/2023   9:21 AM           6077 README.md
-a----         12/1/2023   9:21 AM            892 requirements.txt
-a----         12/1/2023   9:21 AM            322 setup.py

Now we are going to set up a Python virtual environment, as outlined in the setup instructions in README.md. A Python virtual environment is an isolated environment in which an application and its dependencies run, kept separate from other virtual environments and from the system-wide Python installation.


PS C:\Users\Bob\self-operating-computer> python3 -m venv venv        

Use the dir command again, this time on the venv\Scripts directory, to see the scripts and executables the virtual environment created.

PS C:\Users\Bob\self-operating-computer> dir .\venv\Scripts

    Directory: C:\Users\Bob\self-operating-computer\venv\Scripts

Mode                 LastWriteTime         Length Name
----                 -------------         ------ ----
-a----         12/1/2023   9:47 AM           2089 activate
-a----         12/1/2023   9:47 AM           1018 activate.bat
-a----         12/1/2023   9:47 AM          26199 Activate.ps1
-a----         12/1/2023   9:47 AM            393 deactivate.bat
-a----         12/1/2023   9:47 AM         108427 pip.exe
-a----         12/1/2023   9:47 AM         108427 pip3.11.exe
-a----         12/1/2023   9:47 AM         108427 pip3.exe
-a----         12/1/2023   9:46 AM         270616 python.exe
-a----         12/1/2023   9:46 AM         259352 pythonw.exe

We will now activate the virtual environment, but first we must set our PowerShell execution policy using the command below. Keep in mind that by default PowerShell will not allow the execution of scripts such as Activate.ps1. The command below sets the execution policy to allow us to run scripts, and because it uses -Scope Process, the change applies only to the current PowerShell session.

PS C:\Users\Bob\self-operating-computer> Set-ExecutionPolicy -ExecutionPolicy Bypass -Scope Process

Then we can use Activate.ps1 to activate the virtual environment.

PS C:\Users\Bob\self-operating-computer> venv\Scripts\Activate.ps1

Notice the (venv) in your prompt? That means we are "in" the Python virtual environment. Now we will install all the required dependencies using pip.

(venv) PS C:\Users\Bob\self-operating-computer> pip install -r requirements.txt
(venv) PS C:\Users\Bob\self-operating-computer> pip install .

This will take a minute or so to download and install all the dependencies.
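Once the install finishes, you can double check from inside Python that the virtual environment is active and that packages were installed into it. The sketch below uses only the standard library. The distribution name "self-operating-computer" is my assumption about what pip install . registers, so treat a None result for it as a hint to look at pip list rather than as proof that something failed.

```python
import sys
from importlib import metadata


def in_virtualenv() -> bool:
    """True when the running interpreter belongs to a venv-style environment."""
    return sys.prefix != sys.base_prefix


def installed_version(package: str):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


print("virtual environment active:", in_virtualenv())
print("pip:", installed_version("pip"))
# Assumed distribution name, not confirmed by this guide.
print("self-operating-computer:", installed_version("self-operating-computer"))
```

Save it as a file and run it with python from the same (venv) prompt; running it from a different PowerShell window would check that window's Python instead.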

Before continuing with our work in the command line we are going to go back to the OpenAI API website and create an API key.

5. Create an OpenAI API Key

In this case we are creating an API key so that SOCF can use the OpenAI API through our own account. Don't worry, it is not expensive; in your testing you probably will not even spend $5. However, it is important to be mindful of API usage and billing because it can get expensive if you are using APIs on a larger scale. Take, for example, apps like Uber. Uber uses the Google Maps API as part of the process of connecting drivers with riders. I can only imagine how big a bill Uber gets from Google on a regular basis for that scale of Google Maps API usage.

Let's proceed with creating an API key. On the OpenAI API site there is an option on the left called API keys (the one with the lock next to it). Click API keys and select Create new secret key.

Creating a new secret key in OpenAI's API

Once you click Create secret key, it will display and highlight your API key. Copy this and save it to a secure location. It is important to keep that key secret because it is what we will use to allow SOCF to use the OpenAI API. If a malicious person got hold of your API key, they could run up a bill on your dime or even try committing crimes with it, which, if discovered, could be attributed to your account.

Now that we have our API key created, let's use it!

6. Use your OpenAI API Key and Test the Self Operating Computer Framework

Head back to the PowerShell prompt that has the virtual environment running. We are going to provide the API key as an environment variable through a .env file. This way we do not need to hard code our API key into the source code of the application. The code is written to reference the .env file.

Use the mv command (a PowerShell alias for Move-Item) to rename .example.env to .env.

(venv) PS C:\Users\Bob\self-operating-computer> mv .example.env .env

Now use your favorite text or code editor to open that file. Add your API key directly after the = sign.

Insert your OpenAI API key
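If you are curious how a .env file turns into environment variables, here is a simplified Python sketch of the idea. It is not the project's own loader (projects often use a library such as python-dotenv for this, which I am assuming rather than quoting from the source), and the variable name OPENAI_API_KEY and the example key are placeholders of mine.

```python
import os
from pathlib import Path


def parse_env_text(text: str) -> dict:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip("'\"")  # drop surrounding quotes
    return values


def load_env_file(path: Path) -> dict:
    """Read a .env file and copy its values into os.environ without overriding."""
    values = parse_env_text(path.read_text(encoding="utf-8"))
    for key, value in values.items():
        os.environ.setdefault(key, value)
    return values


def mask(secret: str) -> str:
    """Hide all but the last four characters so the key is safer to print."""
    return "*" * max(len(secret) - 4, 0) + secret[-4:]


# Placeholder content, not a real key and not necessarily the real variable name.
example = parse_env_text("OPENAI_API_KEY='sk-example-not-a-real-key'\n")
print(mask(example["OPENAI_API_KEY"]))
```

Masking the key before printing is a small habit that keeps it out of screenshots and terminal logs.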

Save the file and type operate at the prompt!

The operate command will run the Self Operating Computer Framework.

You should be met with this cool, retro-tech-looking neon green image.

Launching the Self Operating Computer Framework


Then you will be prompted to tell it to do something.

Feel free to ask it to navigate to the LTN Labs YouTube channel and subscribe (shameless plug). I also like having it try things in the operating system, like creating files, checking system information and other IT-related tasks.

Thank you for Reading!

I hope you enjoyed this walkthrough!

Feel free to reach out to me if you need any help getting this working. You can get in touch with me via any of the following:

You should connect with the company behind this project as well. HyperWrite is the name and they also have a Discord server:

HyperWrite Discord: https://discord.gg/zxCYFXvdCW


Keep learning!


Shoutout to the Creators of the Self Operating Computer Framework

I am simply an admirer of this project. I believe it has a lot of promise and I'm sure lots of work is going into it. Kudos to the following for creating this awesome project:


