Building with the OpenAI API: GPTs, Assistants, and Function Calling

If you’re a developer and haven’t yet integrated the OpenAI API into a project, it’s probably just a matter of time. OpenAI’s ecosystem in 2026 is mature, well-documented, and offers specific tools for different types of applications: from simple chatbots to complex agents with access to external tools.

In this practical guide, we’re going to explore the three main pillars of OpenAI’s developer ecosystem: the Assistants API, Function Calling, and the GPT Builder. With real code examples and cost estimates so you can plan accordingly.


The OpenAI ecosystem for developers in 2026

Before diving into details, here’s a quick map:

  • Chat Completions API — The base API. You send messages, receive responses. Ideal for simple cases.
  • Assistants API — For building assistants with memory, files, and tools.
  • Function Calling / Tool Use — To give models access to external functions.
  • GPT Builder — No-code interface to create custom GPTs (ChatGPT Plus).
  • Responses API — New unified API that combines the best of Chat Completions and Assistants.

1. The Assistants API: build agents with memory

The Assistants API is the most powerful component for developers who want to build stateful applications — that is, applications that remember previous conversations and can access files.

Key concepts

  • Assistant — The agent with its configuration (model, instructions, tools)
  • Thread — A specific conversation (can last days or weeks)
  • Run — The execution of a response within a Thread
  • File — Documents that the assistant can process and search

Complete example: technical support assistant

from openai import OpenAI

client = OpenAI(api_key="your-api-key")

# 1. Create the assistant (done once)
assistant = client.beta.assistants.create(
    name="yoDEV Technical Support",
    instructions="""
    You are a support assistant for LatAm developers.
    Respond in English, with code examples when useful.
    If you don't know the answer, say you'll investigate.
    """,
    model="gpt-4o",
    tools=[
        {"type": "file_search"},  # search in documents
        {"type": "code_interpreter"}  # execute code
    ]
)

print(f"Assistant ID: {assistant.id}")
# Save this ID — you'll reuse it

# 2. Create a thread for a user
thread = client.beta.threads.create()

# 3. Add the user's message
client.beta.threads.messages.create(
    thread_id=thread.id,
    role="user",
    content="How do I configure CORS in Express for production?"
)

# 4. Run the assistant
run = client.beta.threads.runs.create_and_poll(
    thread_id=thread.id,
    assistant_id=assistant.id
)

# 5. Get the response
if run.status == "completed":
    messages = client.beta.threads.messages.list(thread_id=thread.id)
    print(messages.data[0].content[0].text.value)

Key advantage: threads persist

Unlike simple Chat Completions, threads automatically maintain history. The next message the user sends to the same thread already includes the entire previous conversation. OpenAI manages memory for you.


2. Function Calling: connect the model to the real world

Function Calling (or “Tool Use”) is what transforms a language model into an agent that can execute real actions: query a database, call an external API, modify a file, send an email.

How the flow works

1. You → Define available functions (JSON Schema)
2. User → Asks a question
3. Model → Decides which function to call and with what parameters
4. Your code → Executes the function
5. You → Return the result to the model
6. Model → Generates the final response using the result

Example: assistant with database access

import json
from openai import OpenAI

client = OpenAI()

# Define available tools
tools = [
    {
        "type": "function",
        "function": {
            "name": "search_users",
            "description": "Search users in the database by name or email",
            "parameters": {
                "type": "object",
                "properties": {
                    "query": {
                        "type": "string",
                        "description": "Search term (name or email)"
                    },
                    "limit": {
                        "type": "integer",
                        "description": "Maximum number of results",
                        "default": 10
                    }
                },
                "required": ["query"]
            }
        }
    },
    {
        "type": "function",
        "function": {
            "name": "get_statistics",
            "description": "Get system statistics",
            "parameters": {
                "type": "object",
                "properties": {
                    "period": {
                        "type": "string",
                        "enum": ["today", "week", "month"],
                        "description": "Period for the statistics"
                    }
                },
                "required": ["period"]
            }
        }
    }
]

# Your actual functions
def search_users(query: str, limit: int = 10):
    # Your actual DB logic goes here
    return [
        {"id": 1, "name": "MarĂ­a GarcĂ­a", "email": "maria@example.com"},
        {"id": 2, "name": "Carlos LĂłpez", "email": "carlos@example.com"}
    ]

def get_statistics(period: str):
    # Your actual logic goes here
    return {"active_users": 1250, "registrations_today": 45, "period": period}

# The agent in action
def run_agent(user_message: str):
    messages = [{"role": "user", "content": user_message}]
    
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=messages,
        tools=tools
    )
    
    # If the model wants to use a tool
    while response.choices[0].finish_reason == "tool_calls":
        tool_calls = response.choices[0].message.tool_calls
        messages.append(response.choices[0].message)
        
        for call in tool_calls:
            args = json.loads(call.function.arguments)
            
            # Execute the corresponding function
            if call.function.name == "search_users":
                result = search_users(**args)
            elif call.function.name == "get_statistics":
                result = get_statistics(**args)
            else:
                result = {"error": f"unknown tool: {call.function.name}"}
            
            # Add the result to the context
            messages.append({
                "role": "tool",
                "tool_call_id": call.id,
                "content": json.dumps(result, ensure_ascii=False)
            })
        
        # Next call with the results
        response = client.chat.completions.create(
            model="gpt-4o",
            messages=messages,
            tools=tools
        )
    
    return response.choices[0].message.content

# Usage
result = run_agent("How many users did we register this week?")
print(result)
# "This week you had 45 new registrations. Total active users are 1,250."

3. Structured Outputs: responses always in the correct format

One of the most useful additions: Structured Outputs guarantee that the model always responds with JSON that follows your schema exactly. No more malformed responses.

from pydantic import BaseModel
from typing import List

class CVExtraction(BaseModel):
    name: str
    email: str
    years_experience: int
    technologies: List[str]
    level: str  # junior, semi-senior, senior

# The model ALWAYS responds with this format
response = client.beta.chat.completions.parse(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": "Extract information from developer CVs"},
        {"role": "user", "content": cv_text}
    ],
    response_format=CVExtraction
)

candidate = response.choices[0].message.parsed
print(f"Name: {candidate.name}")
print(f"Technologies: {', '.join(candidate.technologies)}")
print(f"Level: {candidate.level}")

4. GPT Builder: the no-code path

If your goal is to create a personalized assistant without writing code, ChatGPT’s GPT Builder lets you configure everything from a visual interface:

  • Behavior instructions
  • Knowledge files (PDFs, documents)
  • Custom actions (connect your API)
  • Image and name of the GPT

The GPTs you create can be private (only you), shared via link, or published in the GPT Store.

When to use GPT Builder vs the API: If you need to integrate the assistant into your own application, the API is the way to go. If you just want a personalized assistant within ChatGPT for your team, GPT Builder is much faster.


Pricing in 2026: real estimates for LatAm

The most efficient way to control costs is understanding the tokens ↔ price relationship. A quick reference:

| Model       | Input (1M tokens) | Output (1M tokens) |
|-------------|-------------------|--------------------|
| GPT-4o      | $2.50             | $10.00             |
| GPT-4o mini | $0.15             | $0.60              |
| GPT-4.1     | $2.00             | $8.00              |
| o3-mini     | $1.10             | $4.40              |

How much is 1M tokens in practice?

1 token ≈ 0.75 words in English, ~0.6 words in Spanish

  • A normal chat conversation (10 turns): ~3,000-5,000 tokens
  • Analysis of a 10-page document: ~15,000 tokens
  • Medium-length generated response: ~500-1,000 tokens
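
The arithmetic behind these estimates is simple enough to wrap in a helper. A minimal sketch, with the per-million-token prices hardcoded from the table above (`estimate_cost` is an illustrative name; update the prices whenever OpenAI changes them):

```python
# Per-million-token prices (USD) from the pricing table above.
PRICES = {
    "gpt-4o": {"input": 2.50, "output": 10.00},
    "gpt-4o-mini": {"input": 0.15, "output": 0.60},
}

def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Approximate USD cost of a single API call."""
    p = PRICES[model]
    return (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000

# A 10-turn chat: roughly 4,000 input + 1,000 output tokens
print(f"${estimate_cost('gpt-4o-mini', 4_000, 1_000):.4f}")  # $0.0012
```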

Cost estimates by use case (GPT-4o):

| Use case                        | Approx. tokens | Approx. cost |
|---------------------------------|----------------|--------------|
| Chatbot (1 conversation)        | 2,000 tokens   | $0.002       |
| Code analysis (medium function) | 1,500 tokens   | $0.0015      |
| Report generation (1 page)      | 3,000 tokens   | $0.003       |
| Processing 100 emails           | 50,000 tokens  | $0.05        |

For most medium-sized applications, API costs are very manageable. If your app processes 10,000 conversations/month with GPT-4o mini, you'd be spending approximately $15-30 USD/month on the API.

LatAm Tip: OpenAI accepts international cards. You can load credits starting at $5 USD to start experimenting. The Batch API offers 50% discount for asynchronous processing (ideal for non-real-time tasks).


Practical example: email data extractor

from openai import OpenAI
from pydantic import BaseModel
from typing import Optional

client = OpenAI()

class EmailData(BaseModel):
    sender: str
    subject: str
    action_required: Optional[str]
    urgency: str  # "high", "medium", "low"
    summary: str

def process_email(email_content: str) -> EmailData:
    response = client.beta.chat.completions.parse(
        model="gpt-4o-mini",  # cheaper for bulk processing
        messages=[
            {
                "role": "system",
                "content": "Extract structured information from technical support emails"
            },
            {
                "role": "user",
                "content": f"Email to process:\n\n{email_content}"
            }
        ],
        response_format=EmailData
    )
    return response.choices[0].message.parsed

# Usage
email = """
From: juan@startup.com
To: support@company.com
Subject: URGENT - Production down

Hi, we've been unable to access the production dashboard for 2 hours.
Users are reporting 500 errors. We need help now.
"""

data = process_email(email)
print(f"Urgency: {data.urgency}")
print(f"Action: {data.action_required}")
# Urgency: high
# Action: Check production dashboard logs and resolve 500 error

Tips to reduce costs without sacrificing quality

1. Use the right model for each task

  • GPT-4o mini for classifications, data extraction, simple summaries
  • GPT-4o for complex reasoning, code generation, deep analysis
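
One way to enforce this split in code is a small routing table. The task labels below are our own convention, not an OpenAI feature:

```python
# Cheap model for mechanical tasks, the larger model only where
# reasoning quality pays for itself.
MODEL_BY_TASK = {
    "classification": "gpt-4o-mini",
    "extraction": "gpt-4o-mini",
    "summary": "gpt-4o-mini",
    "reasoning": "gpt-4o",
    "code_generation": "gpt-4o",
}

def pick_model(task: str) -> str:
    # Default to the cheap model; escalate only for known-hard tasks.
    return MODEL_BY_TASK.get(task, "gpt-4o-mini")

print(pick_model("extraction"))  # gpt-4o-mini
```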

2. Prompt caching
OpenAI caches repeated prompts (50% discount on input tokens for identical prompts at the start of context).
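
Caching only applies when the start of the prompt is byte-identical across requests (and the prefix exceeds a minimum length), so keep the long static part (instructions, few-shot examples) first and the variable part last. A minimal sketch; `build_messages` is an illustrative helper, not part of the SDK:

```python
# Static prefix, identical on every request -> eligible for prompt caching.
SYSTEM_PROMPT = (
    "You are a support assistant for LatAm developers. "
    "Respond in English, with code examples when useful. "
    # In practice this would be long: detailed instructions, few-shot examples.
)

def build_messages(user_message: str) -> list:
    # Variable content goes last so the cached prefix stays identical.
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": user_message},
    ]
```

If you instead interleave per-user data into the system prompt, every request gets a different prefix and nothing is cached.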

3. Batch API for non-urgent tasks

# Instead of 10,000 individual calls...
# Use the Batch API with 50% discount
client.batches.create(
    input_file_id="file-xxx",
    endpoint="/v1/chat/completions",
    completion_window="24h"
)

4. Streaming for better UX
Doesn’t reduce costs, but makes responses reach users faster:

stream = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Generate a README for this project..."}],
    stream=True
)
for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)

Where to start?

If you’re just entering the OpenAI ecosystem:

  1. Create an account at platform.openai.com
  2. Load $5-10 USD of credits to experiment
  3. Start with the Playground — web interface to test models without code
  4. Your first app: a simple chatbot with Chat Completions
  5. When you need memory: migrate to Assistants API
  6. When you need actions: add Function Calling

The learning curve is smooth. In a weekend you can have your first functional assistant integrated into a web app.


What types of applications are you building with the OpenAI API in your projects? Do you have specific use cases for the LatAm market you’d like to share? The comments section is the perfect place to exchange ideas and learn together!

Explore and share your code experiments in our Code Studio — the yoDEV community sandbox.