Java and the OpenAI API: Which Model Should You Choose?

When you start fiddling around with the OpenAI API, you'll find that you've got plenty of models to choose from.

So which one do you go with?

For most folks it will probably come down to the same two things we see in software development all the time: pricing and requirements.

And I'll explain that as we jump into this subject further.

Keep in mind: this article is part of a series on getting up and running with Java and the OpenAI API. If you want to fiddle with the code, you might need to check out some of the earlier posts.

What the Heck Are Models, Anyway?

So what are these models we speak of so casually when discussing artificial intelligence and the looming demise of all humanity?

They're Generative Pre-Trained Transformers.

Does that answer your question? Probably not, so let's go deeper.

For starters, if you look at that big expression above and take note of the first letter in each word, you might make out an acronym you've probably seen before: GPT.

And yep, that's exactly where the acronym GPT comes from.

GPTs are machine learning models used to handle natural language processing. They make it possible for you to type a question in English and get a response in English from a machine.

Now what's that "Pre-Trained" bit all about?

It means the models had to learn English. Similar to how you and I learned English.

And because they had to learn English, they were "trained."

How were they trained? With web content, books, articles, and even social media posts.

In fact, Elon Musk recently tightened the screws on Twitter (now X) because some systems were using tweets to train their language models.

Model Flavors

These models are like any other type of software you've worked with. They've got developers working on them to improve them over time.

So GPT began with GPT-1. Then came GPT-2. Then GPT-3. And GPT-4.

You will not be shocked to learn they each improved on the previous version.

As of this writing, GPT-4 is the latest version. Also as of this writing, I don't have access to GPT-4. :(

So I work with GPT-3, specifically, the 3.5 version.

By the time you're reading this, that might have changed. But that's where I'm at now.

Checking out the Models

So how can you see which models are available? Fortunately, OpenAI offers an endpoint for that.

Just fire up Postman and send a GET request to https://api.openai.com/v1/models

Make sure you set your token in the Authorization tab.

You'll get a response that looks like this:

{
    "object": "list",
    "data": [
        {
            "id": "text-davinci-001",
            "object": "model",
            "created": 1649364042,
            "owned_by": "openai",
            "permission": [
                {
                    "id": "modelperm-CDlahk1RbkghXDjtxqzXoPNo",
                    "object": "model_permission",
                    "created": 1690913868,
                    "allow_create_engine": false,
                    "allow_sampling": true,
                    "allow_logprobs": true,
                    "allow_search_indices": false,
                    "allow_view": true,
                    "allow_fine_tuning": false,
                    "organization": "*",
                    "group": null,
                    "is_blocking": false
                }
            ],
            "root": "text-davinci-001",
            "parent": null
        },
        {
            "id": "text-search-curie-query-001",
            "object": "model",
            "created": 1651172509,
            "owned_by": "openai-dev",
            "permission": [
                {
                    "id": "modelperm-fNgpMH6ZEQulSq1CjzlfQuIe",
                    "object": "model_permission",
                    "created": 1690864192,
                    "allow_create_engine": false,
                    "allow_sampling": true,
                    "allow_logprobs": true,
                    "allow_search_indices": true,
                    "allow_view": true,
                    "allow_fine_tuning": false,
                    "organization": "*",
                    "group": null,
                    "is_blocking": false
                }
            ],
            "root": "text-search-curie-query-001",
            "parent": null
        }
        ...
    ]
}

There will probably be quite a few models in there.

That's the good news. And the bad news.

You see, some of those models are deprecated. You can check the OpenAI deprecation page to see which ones you should avoid.

Also, as a rule of thumb, I'd stick to the GPT-* models as you begin learning OpenAI. They should suffice for now.
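If you'd rather skip Postman, the same lookup can be sketched in plain Java. This is a minimal sketch, not code from the series: it assumes Java 11+ (for the built-in HttpClient), an environment variable named OPENAI_API_KEY, and a hypothetical helper class called ModelLister. It pulls the model IDs out with a regular expression that relies on "id" being followed directly by "object": "model", as in the sample response. A real application would more likely use a JSON library.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class for illustration; not part of the series' code.
final class ModelLister {

    private static final String MODELS_URL = "https://api.openai.com/v1/models";

    // Matches the "id" of a model object. Assumes "id" is immediately followed by
    // "object": "model", so permission entries ("model_permission") are skipped.
    private static final Pattern MODEL_ID = Pattern.compile(
            "\"id\"\\s*:\\s*\"([^\"]+)\"\\s*,\\s*\"object\"\\s*:\\s*\"model\"");

    // Builds the GET request with the API key as a bearer token.
    static HttpRequest buildRequest(final String apiKey) {
        return HttpRequest.newBuilder(URI.create(MODELS_URL))
                .header("Authorization", "Bearer " + apiKey)
                .GET()
                .build();
    }

    // Returns only the model IDs that start with "gpt-".
    static List<String> extractGptModelIds(final String json) {
        final List<String> ids = new ArrayList<>();
        final Matcher matcher = MODEL_ID.matcher(json);
        while (matcher.find()) {
            final String id = matcher.group(1);
            if (id.startsWith("gpt-")) {
                ids.add(id);
            }
        }
        return ids;
    }

    public static void main(final String[] args) throws Exception {
        final String apiKey = System.getenv("OPENAI_API_KEY"); // assumed variable name
        if (apiKey == null || apiKey.isBlank()) {
            System.err.println("Set OPENAI_API_KEY first.");
            return;
        }
        final HttpResponse<String> response = HttpClient.newHttpClient()
                .send(buildRequest(apiKey), HttpResponse.BodyHandlers.ofString());
        extractGptModelIds(response.body()).forEach(System.out::println);
    }
}
```

Run it and you should see one GPT model ID per line, which is a much shorter list to scan than the full response.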

When examining that response above, though, pay particular attention to the "id" attribute. That's where you'll grab the model ID that you'll use in your completion request payloads.

For example:

{
    "model": "gpt-3.5-turbo",
    "messages": [
        {
            "role": "system",
            "content": "You are a helpful assistant."
        },
        {
            "role": "user",
            "content": "What is the best tool for API testing?"
        }
    ]
}

That "model" attribute matches one of the "id" attributes from the response above.

4k, 8k, 16k?

You might notice that some of the model names include a "k" segment, like 8k or 16k.

So what's that all about?

It's about the maximum number of tokens the model can handle at once. That limit is called the context window, and it covers your request and the response together.

So, for example, consider this model: gpt-3.5-turbo-16k. That means you can work with as much as 16k (16,384) tokens.

But you will pay for that.

Although gpt-3.5-turbo-16k gives you four times the number of tokens (the default is 4k), you will pay twice as much per token.

This all gets back to what I said at the beginning: it comes down to pricing and requirements.

If you really need 16,384 tokens in your request, then by all means go with a 16k model.

I don't think most folks will need that, though.
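If you'd like the code to make that call for you, here's a rough sketch. It assumes the limits are 4,096 tokens for gpt-3.5-turbo and 16,384 for gpt-3.5-turbo-16k, that the limit covers the prompt plus the reply, and that English text runs at very roughly four characters per token. That last one is only a rule of thumb, so use a real tokenizer if you need accuracy. The ModelPicker class and its method names are hypothetical.

```java
// Hypothetical helper for illustration; the names are not from the series' code.
final class ModelPicker {

    static final int DEFAULT_CONTEXT_TOKENS = 4_096;  // assumed limit for gpt-3.5-turbo
    static final int LARGE_CONTEXT_TOKENS = 16_384;   // assumed limit for gpt-3.5-turbo-16k

    // Very rough estimate: about four characters per token, rounded up.
    static int estimateTokens(final String text) {
        return (text.length() + 3) / 4;
    }

    // Picks the cheaper 4k model whenever the prompt plus the reply fits in it.
    static String pick(final int promptTokens, final int maxCompletionTokens) {
        final int needed = promptTokens + maxCompletionTokens;
        if (needed <= DEFAULT_CONTEXT_TOKENS) {
            return "gpt-3.5-turbo";
        }
        if (needed <= LARGE_CONTEXT_TOKENS) {
            return "gpt-3.5-turbo-16k";
        }
        throw new IllegalArgumentException(
                "Needs " + needed + " tokens, more than the 16k model allows");
    }
}
```

For instance, ModelPicker.pick(3000, 500) stays on gpt-3.5-turbo, while ModelPicker.pick(3000, 2000) moves up to gpt-3.5-turbo-16k. Since the 16k model costs twice as much per token, defaulting to the smaller one keeps the bill down.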

What's That Date Thing?

You might also notice some models with names like this: gpt-3.5-turbo-0613.

That suffix is a date: 0613 means June 13. It marks a fixed snapshot of the model. A snapshot won't receive any updates, and it will be deprecated three months after a new version is released.

I see a lot of folks online using these models and so I followed suit. But I'm not doing that anymore.

For me, as of this writing, it's gpt-3.5-turbo.
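If you want your code to notice when somebody has pinned a dated snapshot, a small check like the one below could do it. It's a sketch under one big assumption: that the suffix is always a hyphen followed by exactly four digits (month and day) at the very end of the model ID. The ModelNames class is hypothetical, and naming schemes can change, so treat it as a starting point.

```java
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of the series' code.
final class ModelNames {

    // Assumes a dated snapshot ends with a hyphen and four digits, e.g. "-0613".
    private static final Pattern SNAPSHOT_SUFFIX = Pattern.compile("-\\d{4}$");

    // True for IDs like "gpt-3.5-turbo-0613", false for "gpt-3.5-turbo".
    static boolean isDatedSnapshot(final String modelId) {
        return SNAPSHOT_SUFFIX.matcher(modelId).find();
    }

    // Drops the date suffix, leaving the ID that tracks the current version.
    static String withoutSnapshotSuffix(final String modelId) {
        return SNAPSHOT_SUFFIX.matcher(modelId).replaceFirst("");
    }
}
```

With that in place, ModelNames.withoutSnapshotSuffix("gpt-3.5-turbo-0613") gives you back plain gpt-3.5-turbo.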

And What, Exactly, Is the Cost?

So what do these various models cost?

You'll want to check the OpenAI pricing page for the latest info, because the numbers change over time.

As of right now, though, the short version is this: if you do end up going with GPT-4, you're going to pay quite a bit more money.

Not Covered: Models for Embeddings

Keep in mind: the models I've covered in this article deal with chat completions. They're not the same models that you'll use for embeddings.

I'll cover those models later in the series.

Setting the Model Programmatically

Now let's take a look at how we can set the model within a Java application.

Recall that our ChatCompletionRequestHelper class includes the following line:

private static final String MODEL = "gpt-3.5-turbo";

You can see that constant in action in the code here:

    // Builds a chat completion request that targets the model named in MODEL.
    public static ChatCompletionRequest getRequest(final List<ChatMessage> messages, final Double temperature) {
        return ChatCompletionRequest
                .builder()
                .model(MODEL) // the model ID, e.g. "gpt-3.5-turbo"
                .messages(messages)
                .temperature(temperature)
                .build();
    }

Note the model() method.

Now, if you want to change the model, simply change the MODEL constant to the appropriate model ID.

Then, run either of the existing tests. You'll get a response from the new model.
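If editing a constant and recompiling every time gets old, one option is to read the model ID from the environment and fall back to a default. Here's a minimal sketch of that idea. The ModelConfig class and the OPENAI_MODEL variable name are my own assumptions for illustration, not something the series or the API defines.

```java
// Hypothetical helper for illustration; not part of the series' code.
final class ModelConfig {

    static final String DEFAULT_MODEL = "gpt-3.5-turbo";

    // Returns the configured model ID, or the default when none is set.
    static String resolve(final String configured) {
        if (configured == null || configured.isBlank()) {
            return DEFAULT_MODEL;
        }
        return configured.trim();
    }

    // Reads the model ID from an environment variable (assumed name: OPENAI_MODEL).
    static String fromEnvironment() {
        return resolve(System.getenv("OPENAI_MODEL"));
    }
}
```

You could then pass ModelConfig.fromEnvironment() to the builder's model() method instead of the MODEL constant, and switch models between runs without touching the code.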

Wrapping It Up

And now you know more about those OpenAI models.

You also know how to set the OpenAI API chat completion model in your Java application.

So fiddle around. Try some different models and see what kinds of different responses you get.

Have fun!

Photo by cottonbro studio: https://www.pexels.com/photo/bionic-hand-and-human-hand-finger-pointing-6153354/