
OpenAI introduced its long-awaited open-weight language model, gpt-oss. Its main advantage is that it can run locally on your own computer, including Macs with Apple Silicon processors. The model is available in two versions: gpt-oss-20b and gpt-oss-120b. The first is a "mid-range" model that can run on high-end Macs with sufficient resources. The second is a "heavy" model that requires much more serious hardware.

The smaller version is expected to hallucinate more often (meaning it will invent facts), since it is a much smaller model. However, it operates faster and can realistically run on home computers. Even in this smaller form, gpt-oss is an interesting tool for anyone wanting to try running a large language model right on their laptop. Keep in mind, though, that unlike the familiar ChatGPT, this model operates without an internet connection, and by default it lacks many of the "features" of advanced chatbots. For example, it doesn't verify answers through search engines, which significantly increases the risk of errors.

To run gpt-oss-20b, OpenAI recommends a minimum of 16 GB of RAM, but in practice this is more of a lower limit that lets you see how everything works. It's no surprise that Apple recently stopped selling Macs with 8 GB of RAM - AI is increasingly becoming a daily task for computers.
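As a rough sanity check on those numbers: gpt-oss-20b ships with heavily quantized weights, at roughly 4 bits per parameter. The exact parameter count and per-parameter cost below are approximations, not official figures, but they show why the download lands around 15 GB and why 16 GB of RAM is only just enough:

```python
# Back-of-envelope memory estimate for gpt-oss-20b.
# Assumes ~21 billion parameters at ~4.25 bits each (approximate
# 4-bit quantization cost including scaling metadata) -- an estimate,
# not a published spec.
params = 21e9
bits_per_param = 4.25
weight_gb = params * bits_per_param / 8 / 1e9
print(f"~{weight_gb:.1f} GB just for the weights")
```

That figure covers only the weights; the KV cache, the app itself, and macOS all need memory on top of it, which is why a 16 GB machine ends up swapping.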
Getting started is straightforward - install the Ollama app, which lets you download and manage the model.
You can download it from ollama.com/download. Then open the Terminal and enter the following commands:
Code:
ollama pull gpt-oss:20b
ollama run gpt-oss:20b
The model will take up about 15 GB of disk space. After downloading, you can select it in the Ollama interface.
You can enable "airplane mode" in the settings to ensure completely local operation - no internet is required.
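If you prefer scripting over the interactive prompt, Ollama also serves a local HTTP API (by default on port 11434) that you can call from any language. Here is a minimal sketch using Ollama's documented /api/generate route; treat it as an untested outline, and note that the model name must match whatever you pulled:

```python
import json
import urllib.request

# Ollama's default local endpoint; no internet access is involved.
OLLAMA_URL = "http://localhost:11434/api/generate"

def build_payload(prompt, model="gpt-oss:20b"):
    # stream=False asks Ollama for a single complete JSON response
    # instead of a stream of partial tokens.
    return {"model": model, "prompt": prompt, "stream": False}

def generate(prompt, model="gpt-oss:20b"):
    # POST the prompt to the locally running Ollama server and
    # return the model's text reply.
    req = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(build_payload(prompt, model)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

With the model running, `print(generate("hello"))` should return the reply as a string - though, as noted below, expect long waits on a 16 GB machine.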
From there, it's simple: enter a query and watch the result. Keep in mind, though, that the model uses all available resources, and your Mac may noticeably slow down. For example, on a MacBook Air with an M4 chip and 16 GB of memory, a response to "hello" took over five minutes, while the query "who was the 13th president of the USA" took about 43 minutes. So if you plan to use the model seriously - 16 GB is, to put it mildly, insufficient. If you no longer need the model and want to free up disk space, use the command:
Code:
ollama rm gpt-oss:20b
You can find additional information on the official Ollama website.
You can also try an alternative application for macOS - LM Studio.