# Building with the OpenAI API: GPTs, Assistants, and Function Calling
If you're a developer and haven't yet integrated the OpenAI API into a project, it's probably only a matter of time. OpenAI's ecosystem in 2026 is mature, well documented, and offers specific tools for different kinds of applications, from simple chatbots to complex agents with access to external tools.
In this practical guide, we'll explore the three main pillars of OpenAI's developer ecosystem: the Assistants API, Function Calling, and the GPT Builder, with real code examples and cost estimates so you can plan accordingly.
## The OpenAI ecosystem for developers in 2026
Before diving into the details, here's a quick map:
- **Chat Completions API** → the base API. You send messages, receive responses. Ideal for simple cases.
- **Assistants API** → for building assistants with memory, files, and tools.
- **Function Calling / Tool Use** → to give models access to external functions.
- **GPT Builder** → no-code interface to create custom GPTs (ChatGPT Plus).
- **Responses API** → new unified API that combines the best of Chat Completions and Assistants.
## 1. The Assistants API: build agents with memory
The Assistants API is the most powerful component for developers who want to build stateful applications: applications that remember previous conversations and can access files.
### Key concepts
- **Assistant** → the agent and its configuration (model, instructions, tools)
- **Thread** → a specific conversation (can last days or weeks)
- **Run** → one execution of a response within a Thread
- **File** → documents the assistant can process and search
### Complete example: a technical support assistant

```python
from openai import OpenAI

client = OpenAI(api_key="your-api-key")

# 1. Create the assistant (done once)
assistant = client.beta.assistants.create(
    name="yoDEV Technical Support",
    instructions="""
    You are a support assistant for LatAm developers.
    Respond in English, with code examples when useful.
    If you don't know the answer, say you'll investigate.
    """,
    model="gpt-4o",
    tools=[
        {"type": "file_search"},       # search in documents
        {"type": "code_interpreter"},  # execute code
    ],
)

print(f"Assistant ID: {assistant.id}")
# Save this ID; you'll reuse it

# 2. Create a thread for a user
thread = client.beta.threads.create()

# 3. Add the user's message
client.beta.threads.messages.create(
    thread_id=thread.id,
    role="user",
    content="How do I configure CORS in Express for production?",
)

# 4. Run the assistant (create_and_poll blocks until the run finishes)
run = client.beta.threads.runs.create_and_poll(
    thread_id=thread.id,
    assistant_id=assistant.id,
)

# 5. Get the response (messages are listed newest first)
if run.status == "completed":
    messages = client.beta.threads.messages.list(thread_id=thread.id)
    print(messages.data[0].content[0].text.value)
```
### Key advantage: threads persist
Unlike simple Chat Completions, threads automatically maintain history. The next message the user sends to the same thread already includes the entire previous conversation. OpenAI manages memory for you.
## 2. Function Calling: connect the model to the real world
Function Calling (or "Tool Use") is what transforms a language model into an agent that can execute real actions: query a database, call an external API, modify a file, send an email.
### How the flow works
1. You → define the available functions (JSON Schema)
2. User → asks a question
3. Model → decides which function to call and with which parameters
4. Your code → executes the function
5. You → return the result to the model
6. Model → generates the final response using the result
### Example: an assistant with database access

```python
import json
from openai import OpenAI

client = OpenAI()

# Define available tools
tools = [
    {
        "type": "function",
        "function": {
            "name": "search_users",
            "description": "Search users in the database by name or email",
            "parameters": {
                "type": "object",
                "properties": {
                    "query": {
                        "type": "string",
                        "description": "Search term (name or email)",
                    },
                    "limit": {
                        "type": "integer",
                        "description": "Maximum number of results",
                        "default": 10,
                    },
                },
                "required": ["query"],
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "get_statistics",
            "description": "Get system statistics",
            "parameters": {
                "type": "object",
                "properties": {
                    "period": {
                        "type": "string",
                        "enum": ["today", "week", "month"],
                        "description": "Period for the statistics",
                    },
                },
                "required": ["period"],
            },
        },
    },
]

# Your actual functions
def search_users(query: str, limit: int = 10):
    # Your real DB logic goes here
    return [
        {"id": 1, "name": "María García", "email": "maria@example.com"},
        {"id": 2, "name": "Carlos López", "email": "carlos@example.com"},
    ]

def get_statistics(period: str):
    # Your real logic goes here
    return {"active_users": 1250, "registrations_today": 45, "period": period}

# The agent in action
def run_agent(user_message: str):
    messages = [{"role": "user", "content": user_message}]
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=messages,
        tools=tools,
    )

    # While the model wants to use a tool...
    while response.choices[0].finish_reason == "tool_calls":
        tool_calls = response.choices[0].message.tool_calls
        messages.append(response.choices[0].message)

        for call in tool_calls:
            args = json.loads(call.function.arguments)

            # Execute the corresponding function
            if call.function.name == "search_users":
                result = search_users(**args)
            elif call.function.name == "get_statistics":
                result = get_statistics(**args)

            # Add the result to the context
            messages.append({
                "role": "tool",
                "tool_call_id": call.id,
                "content": json.dumps(result, ensure_ascii=False),
            })

        # Next call with the results
        response = client.chat.completions.create(
            model="gpt-4o",
            messages=messages,
            tools=tools,
        )

    return response.choices[0].message.content

# Usage
result = run_agent("How many users did we register this week?")
print(result)
# "This week you had 45 new registrations. Total active users are 1,250."
```

---
## 3. Structured Outputs: responses always in the correct format
One of the most useful additions: **Structured Outputs** guarantees that the model always responds with JSON that exactly follows your schema. No more malformed responses.
```python
from pydantic import BaseModel
from typing import List

class CVExtraction(BaseModel):
    name: str
    email: str
    years_experience: int
    technologies: List[str]
    level: str  # junior, semi-senior, senior

# The model ALWAYS responds in this format
# (cv_text holds the raw CV text you want to parse)
response = client.beta.chat.completions.parse(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": "Extract information from developer CVs"},
        {"role": "user", "content": cv_text},
    ],
    response_format=CVExtraction,
)

candidate = response.choices[0].message.parsed
print(f"Name: {candidate.name}")
print(f"Technologies: {', '.join(candidate.technologies)}")
print(f"Level: {candidate.level}")
```
## 4. GPT Builder: the no-code path
If your goal is to create a personalized assistant without writing code, ChatGPT's GPT Builder lets you configure everything from a visual interface:
- Behavior instructions
- Knowledge files (PDFs, documents)
- Custom actions (connect your API)
- Image and name of the GPT
The GPTs you create can be private (only you), shared via link, or published in the GPT Store.
**When to use GPT Builder vs the API:** if you need to integrate the assistant into your own application, the API is the way to go. If you just want a personalized assistant inside ChatGPT for your team, GPT Builder is much faster.
## Pricing in 2026: real estimates for LatAm
The most efficient way to control costs is to understand the relationship between tokens and price. A quick reference:
| Model | Input (1M tokens) | Output (1M tokens) |
|---|---|---|
| GPT-4o | $2.50 | $10.00 |
| GPT-4o mini | $0.15 | $0.60 |
| GPT-4.1 | $2.00 | $8.00 |
| o3-mini | $1.10 | $4.40 |
### How much is 1M tokens in practice?
1 token ≈ 0.75 words in English, ≈ 0.6 words in Spanish
- A normal chat conversation (10 turns): ~3,000-5,000 tokens
- Analysis of a 10-page document: ~15,000 tokens
- Medium-length generated response: ~500-1,000 tokens
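Under these rules of thumb, converting a word count into a rough token estimate is simple arithmetic. This is an approximation only; for exact counts use a real tokenizer such as OpenAI's tiktoken:

```python
def estimate_tokens(word_count: int, language: str = "en") -> int:
    """Rough token estimate from a word count.

    Uses the rules of thumb above: ~0.75 words per token in English,
    ~0.6 words per token in Spanish.
    """
    words_per_token = {"en": 0.75, "es": 0.6}
    return round(word_count / words_per_token[language])

print(estimate_tokens(750))        # 750 English words -> ~1000 tokens
print(estimate_tokens(600, "es"))  # 600 Spanish words -> ~1000 tokens
```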
Cost estimates by use case (GPT-4o, assuming roughly a 3:1 input/output token split):
| Use case | Approx. tokens | Approx. cost |
|---|---|---|
| Chatbot (1 conversation) | 2,000 tokens | $0.009 |
| Code analysis (medium function) | 1,500 tokens | $0.007 |
| Report generation (1 page) | 3,000 tokens | $0.013 |
| Processing 100 emails | 50,000 tokens | $0.22 |
For most medium-sized applications, API costs are very manageable. If your app processes 10,000 conversations/month with GPT-4o mini, you'd spend approximately $15-30 USD/month on the API.
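You can sanity-check these estimates with a few lines of arithmetic. The helper below uses the list prices from the table above (treat them as illustrative and check the current pricing page). Note that in a multi-turn chat the full history is re-sent on every turn, so input tokens add up quickly:

```python
# Per-1M-token list prices (USD), taken from the table above
PRICES = {
    "gpt-4o":      {"input": 2.50, "output": 10.00},
    "gpt-4o-mini": {"input": 0.15, "output": 0.60},
}

def cost_usd(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimated cost in USD for one request (or one whole conversation)."""
    p = PRICES[model]
    return (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000

# A 10-turn conversation: ~3,000 input tokens in total (history is
# re-sent each turn) and ~2,000 output tokens
per_conversation = cost_usd("gpt-4o-mini", 3_000, 2_000)
print(f"${per_conversation:.5f} per conversation")        # $0.00165
print(f"${per_conversation * 10_000:.2f} for 10,000/mo")  # $16.50
```

That lands inside the $15-30/month range quoted above once you allow for longer conversations and retries.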
LatAm Tip: OpenAI accepts international cards. You can load credits starting at $5 USD to start experimenting. The Batch API offers 50% discount for asynchronous processing (ideal for non-real-time tasks).
### Practical example: email data extractor

```python
from openai import OpenAI
from pydantic import BaseModel
from typing import Optional

client = OpenAI()

class EmailData(BaseModel):
    sender: str
    subject: str
    action_required: Optional[str]
    urgency: str  # "high", "medium", "low"
    summary: str

def process_email(email_content: str) -> EmailData:
    response = client.beta.chat.completions.parse(
        model="gpt-4o-mini",  # cheaper for bulk processing
        messages=[
            {
                "role": "system",
                "content": "Extract structured information from technical support emails",
            },
            {
                "role": "user",
                "content": f"Email to process:\n\n{email_content}",
            },
        ],
        response_format=EmailData,
    )
    return response.choices[0].message.parsed

# Usage
email = """
From: juan@startup.com
To: support@company.com
Subject: URGENT - Production down

Hi, we've been unable to access the production dashboard for 2 hours.
Users are reporting 500 errors. We need help now.
"""

data = process_email(email)
print(f"Urgency: {data.urgency}")
print(f"Action: {data.action_required}")
# Urgency: high
# Action: Check production dashboard logs and resolve 500 error
```
## Tips to reduce costs without sacrificing quality
**1. Use the right model for each task**
- GPT-4o mini for classifications, data extraction, simple summaries
- GPT-4o for complex reasoning, code generation, deep analysis
**2. Prompt caching**
OpenAI caches repeated prompts (50% discount on input tokens for identical prompts at the start of context).
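Caching is applied automatically to sufficiently long prompts and matches on an exact prefix, so keep the static parts of your prompt (system instructions, few-shot examples) first and the per-request content last. A minimal sketch of the idea (the prompt text here is just a placeholder):

```python
# Long and identical on every request -> eligible for prompt caching
STATIC_SYSTEM_PROMPT = "You are a support assistant for LatAm developers. ..."

def build_messages(user_question: str) -> list[dict]:
    # Static prefix first, variable content last, so every request
    # shares the same cacheable beginning.
    return [
        {"role": "system", "content": STATIC_SYSTEM_PROMPT},
        {"role": "user", "content": user_question},
    ]

m1 = build_messages("How do I configure CORS?")
m2 = build_messages("How do I rotate an API key?")
assert m1[0] == m2[0]  # identical prefix across requests
```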
**3. Batch API for non-urgent tasks**

```python
# Instead of 10,000 individual calls...
# ...use the Batch API with a 50% discount
client.batches.create(
    input_file_id="file-xxx",  # a previously uploaded .jsonl file of requests
    endpoint="/v1/chat/completions",
    completion_window="24h",
)
```
**4. Streaming for better UX**
Streaming doesn't reduce costs, but responses start reaching users immediately:

```python
stream = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Generate a README for this project..."}],
    stream=True,
)
for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)
```
## Where to start?
If you're just entering the OpenAI ecosystem:
- Create an account at platform.openai.com
- Load $5-10 USD of credits to experiment
- Start with the Playground, a web interface for testing models without code
- Your first app: a simple chatbot with Chat Completions
- When you need memory: migrate to Assistants API
- When you need actions: add Function Calling
The learning curve is smooth. In a weekend you can have your first functional assistant integrated into a web app.
What types of applications are you building with the OpenAI API in your projects? Do you have specific use cases for the LatAm market you'd like to share? The comments section is the perfect place to exchange ideas and learn together!
Explore and share your code experiments in our Code Studio, the yoDEV community sandbox.