openai-api
Enables comprehensive interaction with OpenAI's APIs, supporting chat completions, embeddings, images, audio, and moderation with advanced features like streaming and function calling.
Installation
```bash
git clone https://github.com/ovachiever/droid-tings.git
```
# OpenAI API - Complete Guide
Version: Production Ready ✅
Package: openai@6.7.0
Last Updated: 2025-10-25
---
Status
✅ Production Ready:
- ✅ Chat Completions API (GPT-5, GPT-4o, GPT-4 Turbo)
- ✅ Embeddings API (text-embedding-3-small, text-embedding-3-large)
- ✅ Images API (DALL-E 3 generation + GPT-Image-1 editing)
- ✅ Audio API (Whisper transcription + TTS with 11 voices)
- ✅ Moderation API (11 safety categories)
- ✅ Streaming patterns (SSE)
- ✅ Function calling / Tools
- ✅ Structured outputs (JSON schemas)
- ✅ Vision (GPT-4o)
- ✅ Both Node.js SDK and fetch approaches
---
Table of Contents
- [Quick Start](#quick-start)
- [Chat Completions API](#chat-completions-api)
- [GPT-5 Series Models](#gpt-5-series-models)
- [Streaming Patterns](#streaming-patterns)
- [Function Calling](#function-calling)
- [Structured Outputs](#structured-outputs)
- [Vision (GPT-4o)](#vision-gpt-4o)
- [Embeddings API](#embeddings-api)
- [Images API](#images-api)
- [Audio API](#audio-api)
- [Moderation API](#moderation-api)
- [Error Handling](#error-handling)
- [Rate Limits](#rate-limits)
- [Production Best Practices](#production-best-practices)
- [Relationship to openai-responses](#relationship-to-openai-responses)
---
Quick Start
Installation
```bash
npm install openai@6.7.0
```
Environment Setup
```bash
export OPENAI_API_KEY="sk-..."
```
Or create .env file:
```
OPENAI_API_KEY=sk-...
```
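Note that Node.js does not read `.env` automatically; a minimal sketch, assuming the `dotenv` package is installed:
```typescript
// Load .env into process.env before constructing the client
import 'dotenv/config';
import OpenAI from 'openai';

const openai = new OpenAI(); // picks up OPENAI_API_KEY from process.env by default
```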
First Chat Completion (Node.js SDK)
```typescript
import OpenAI from 'openai';
const openai = new OpenAI({
apiKey: process.env.OPENAI_API_KEY,
});
const completion = await openai.chat.completions.create({
model: 'gpt-5',
messages: [
{ role: 'user', content: 'What are the three laws of robotics?' }
],
});
console.log(completion.choices[0].message.content);
```
First Chat Completion (Fetch - Cloudflare Workers)
```typescript
const response = await fetch('https://api.openai.com/v1/chat/completions', {
method: 'POST',
headers: {
'Authorization': `Bearer ${env.OPENAI_API_KEY}`,
'Content-Type': 'application/json',
},
body: JSON.stringify({
model: 'gpt-5',
messages: [
{ role: 'user', content: 'What are the three laws of robotics?' }
],
}),
});
const data = await response.json();
console.log(data.choices[0].message.content);
```
---
Chat Completions API
Endpoint: POST /v1/chat/completions
The Chat Completions API is the core interface for interacting with OpenAI's language models. It supports conversational AI, text generation, function calling, structured outputs, and vision capabilities.
Supported Models
#### GPT-5 Series (Released August 2025)
- gpt-5: Full-featured reasoning model with advanced capabilities
- gpt-5-mini: Cost-effective alternative with good performance
- gpt-5-nano: Smallest/fastest variant for simple tasks
#### GPT-4o Series
- gpt-4o: Multimodal model with vision capabilities
- gpt-4-turbo: Fast GPT-4 variant
#### GPT-4 Series
- gpt-4: Original GPT-4 model
Basic Request Structure
```typescript
{
model: string, // Model to use (e.g., "gpt-5")
messages: Message[], // Conversation history
reasoning_effort?: string, // GPT-5 only: "minimal" | "low" | "medium" | "high"
verbosity?: string, // GPT-5 only: "low" | "medium" | "high"
temperature?: number, // NOT supported by GPT-5
max_tokens?: number, // Max tokens to generate
stream?: boolean, // Enable streaming
tools?: Tool[], // Function calling tools
}
```
Response Structure
```typescript
{
id: string, // Unique completion ID
object: "chat.completion",
created: number, // Unix timestamp
model: string, // Model used
choices: [{
index: number,
message: {
role: "assistant",
content: string, // Generated text
tool_calls?: ToolCall[] // If function calling
},
finish_reason: string // "stop" | "length" | "tool_calls"
}],
usage: {
prompt_tokens: number,
completion_tokens: number,
total_tokens: number
}
}
```
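It is worth checking `finish_reason` before trusting the output; a minimal sketch using the `completion` object from the Quick Start example:
```typescript
// Detect truncation and tool-call requests via finish_reason
const choice = completion.choices[0];
if (choice.finish_reason === 'length') {
  // Hit max_tokens; the text is cut off mid-thought
  console.warn(`Truncated after ${completion.usage.completion_tokens} tokens`);
} else if (choice.finish_reason === 'tool_calls') {
  // The model wants a function executed (see Function Calling below)
}
```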
Message Roles
OpenAI supports three message roles:
- system: Set behavior and context (newer reasoning models use the name developer for this role)
- user: User input
- assistant: Model responses
```typescript
const messages = [
{
role: 'system',
content: 'You are a helpful assistant that explains complex topics simply.'
},
{
role: 'user',
content: 'Explain quantum computing to a 10-year-old.'
}
];
```
Multi-turn Conversations
Build conversation history by appending messages:
```typescript
const messages = [
{ role: 'system', content: 'You are a helpful assistant.' },
{ role: 'user', content: 'What is TypeScript?' },
{ role: 'assistant', content: 'TypeScript is a superset of JavaScript...' },
{ role: 'user', content: 'How do I install it?' }
];
const completion = await openai.chat.completions.create({
model: 'gpt-5',
messages: messages,
});
```
Important: Chat Completions API is stateless. You must send full conversation history with each request. For stateful conversations, use the openai-responses skill.
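A minimal sketch of managing that history yourself: a thin helper that appends each turn before and after the call (the variable and function names are illustrative).
```typescript
// Stateless-conversation helper: the caller owns the full message history
const history: { role: 'system' | 'user' | 'assistant'; content: string }[] = [
  { role: 'system', content: 'You are a helpful assistant.' },
];

async function ask(userInput: string): Promise<string> {
  history.push({ role: 'user', content: userInput });
  const completion = await openai.chat.completions.create({
    model: 'gpt-5',
    messages: history,
  });
  const reply = completion.choices[0].message.content ?? '';
  history.push({ role: 'assistant', content: reply }); // persist the turn locally
  return reply;
}
```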
---
GPT-5 Series Models
GPT-5 models (released August 2025) introduce new parameters and capabilities:
Unique GPT-5 Parameters
#### reasoning_effort
Controls the depth of reasoning:
- "minimal": Quick responses, less reasoning
- "low": Basic reasoning
- "medium": Balanced reasoning (default)
- "high": Deep reasoning for complex problems
```typescript
const completion = await openai.chat.completions.create({
model: 'gpt-5',
messages: [{ role: 'user', content: 'Solve this complex math problem...' }],
reasoning_effort: 'high', // Deep reasoning
});
```
#### verbosity
Controls output length and detail:
- "low": Concise responses
- "medium": Balanced detail (default)
- "high": Verbose, detailed responses
```typescript
const completion = await openai.chat.completions.create({
model: 'gpt-5',
messages: [{ role: 'user', content: 'Explain quantum mechanics' }],
verbosity: 'high', // Detailed explanation
});
```
GPT-5 Limitations
NOT Supported with GPT-5:
- ❌ `temperature` parameter
- ❌ `top_p` parameter
- ❌ `logprobs` parameter
- ❌ Chain of Thought (CoT) persistence between turns

If you need these features:
- Use GPT-4o or GPT-4 Turbo for temperature/top_p/logprobs (see the sketch below)
- Use the openai-responses skill for stateful CoT preservation
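For example, a request that needs sampling control targets GPT-4o instead; a minimal sketch:
```typescript
// temperature/top_p are accepted by GPT-4o but, per the list above, not by GPT-5
const completion = await openai.chat.completions.create({
  model: 'gpt-4o',
  messages: [{ role: 'user', content: 'Brainstorm five taglines for a coffee shop' }],
  temperature: 1.2, // higher values sample more diversely
  top_p: 0.95,
});
```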
GPT-5 vs GPT-4o Comparison
| Feature | GPT-5 | GPT-4o |
|---------|-------|--------|
| Reasoning control | ✅ reasoning_effort | ❌ |
| Verbosity control | ✅ verbosity | ❌ |
| Temperature | ❌ | ✅ |
| Top-p | ❌ | ✅ |
| Vision | ❌ | ✅ |
| Function calling | ✅ | ✅ |
| Streaming | ✅ | ✅ |
When to use GPT-5: Complex reasoning tasks, mathematical problems, logic puzzles, code generation
When to use GPT-4o: Vision tasks, when you need temperature control, multimodal inputs
---
Streaming Patterns
Streaming allows real-time token-by-token delivery, improving perceived latency for long responses.
Enable Streaming
Set stream: true:
```typescript
const stream = await openai.chat.completions.create({
model: 'gpt-5',
messages: [{ role: 'user', content: 'Tell me a story' }],
stream: true,
});
```
Streaming with Node.js SDK
```typescript
import OpenAI from 'openai';
const openai = new OpenAI();
const stream = await openai.chat.completions.create({
model: 'gpt-5',
messages: [{ role: 'user', content: 'Write a poem about coding' }],
stream: true,
});
for await (const chunk of stream) {
const content = chunk.choices[0]?.delta?.content || '';
process.stdout.write(content);
}
```
Streaming with Fetch (Cloudflare Workers)
```typescript
const response = await fetch('https://api.openai.com/v1/chat/completions', {
method: 'POST',
headers: {
'Authorization': `Bearer ${env.OPENAI_API_KEY}`,
'Content-Type': 'application/json',
},
body: JSON.stringify({
model: 'gpt-5',
messages: [{ role: 'user', content: 'Write a poem' }],
stream: true,
}),
});
const reader = response.body?.getReader();
const decoder = new TextDecoder();
while (true) {
const { done, value } = await reader!.read();
if (done) break;
const chunk = decoder.decode(value);
const lines = chunk.split('\n').filter(line => line.trim() !== '');
for (const line of lines) {
if (line.startsWith('data: ')) {
const data = line.slice(6);
if (data === '[DONE]') break;
try {
const json = JSON.parse(data);
const content = json.choices[0]?.delta?.content || '';
console.log(content);
} catch (e) {
// Skip invalid JSON
}
}
}
}
```
Server-Sent Events (SSE) Format
Streaming uses Server-Sent Events:
```
data: {"id":"chatcmpl-xyz","choices":[{"delta":{"role":"assistant"}}]}
data: {"id":"chatcmpl-xyz","choices":[{"delta":{"content":"Hello"}}]}
data: {"id":"chatcmpl-xyz","choices":[{"delta":{"content":" world"}}]}
data: {"id":"chatcmpl-xyz","choices":[{"finish_reason":"stop"}]}
data: [DONE]
```
Streaming Best Practices
✅ Always handle:
- Incomplete chunks (buffer partial data, as sketched below)
- The `[DONE]` signal
- Network errors and retries
- Invalid JSON (skip gracefully)

✅ Performance:
- Use streaming for responses >100 tokens
- Don't stream if you need the full response before processing

❌ Don't:
- Assume chunks are always complete JSON
- Forget to close the stream on errors
- Buffer entire response in memory (defeats streaming purpose)
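A minimal sketch of the buffering advice above: keep the trailing partial line across reads instead of assuming each network chunk ends on an event boundary (the function name is illustrative).
```typescript
// Buffered SSE parsing: a chunk may end mid-line, so carry the tail to the next read
async function consumeSSE(response: Response, onToken: (t: string) => void) {
  const reader = response.body!.getReader();
  const decoder = new TextDecoder();
  let buffer = '';
  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    buffer += decoder.decode(value, { stream: true });
    const lines = buffer.split('\n');
    buffer = lines.pop() ?? ''; // last element may be an incomplete line
    for (const line of lines) {
      if (!line.startsWith('data: ')) continue;
      const data = line.slice(6).trim();
      if (data === '[DONE]') return;
      try {
        onToken(JSON.parse(data).choices[0]?.delta?.content ?? '');
      } catch {
        // ignore partial or invalid JSON
      }
    }
  }
}
```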
---
Function Calling
Function calling (also called "tool calling") allows models to invoke external functions/tools based on conversation context.
Basic Tool Definition
```typescript
const tools = [
{
type: 'function',
function: {
name: 'get_weather',
description: 'Get the current weather for a location',
parameters: {
type: 'object',
properties: {
location: {
type: 'string',
description: 'City name, e.g., San Francisco'
},
unit: {
type: 'string',
enum: ['celsius', 'fahrenheit'],
description: 'Temperature unit'
}
},
required: ['location']
}
}
}
];
```
Making a Request with Tools
```typescript
const completion = await openai.chat.completions.create({
model: 'gpt-5',
messages: [
{ role: 'user', content: 'What is the weather in San Francisco?' }
],
tools: tools,
});
```
Handling Tool Calls
```typescript
const message = completion.choices[0].message;
if (message.tool_calls) {
// Model wants to call a function
for (const toolCall of message.tool_calls) {
if (toolCall.function.name === 'get_weather') {
const args = JSON.parse(toolCall.function.arguments);
// Execute your function
const weatherData = await getWeather(args.location, args.unit);
// Send result back to model
const followUp = await openai.chat.completions.create({
model: 'gpt-5',
messages: [
...messages,
message, // Assistant's tool call
{
role: 'tool',
tool_call_id: toolCall.id,
content: JSON.stringify(weatherData)
}
],
tools: tools,
});
}
}
}
```
Complete Function Calling Flow
```typescript
async function chatWithTools(userMessage: string) {
let messages = [
{ role: 'user', content: userMessage }
];
while (true) {
const completion = await openai.chat.completions.create({
model: 'gpt-5',
messages: messages,
tools: tools,
});
const message = completion.choices[0].message;
messages.push(message);
// If no tool calls, we're done
if (!message.tool_calls) {
return message.content;
}
// Execute all tool calls
for (const toolCall of message.tool_calls) {
const result = await executeFunction(toolCall.function.name, toolCall.function.arguments);
messages.push({
role: 'tool',
tool_call_id: toolCall.id,
content: JSON.stringify(result)
});
}
}
}
```
Multiple Tools
You can define multiple tools:
```typescript
const tools = [
{
type: 'function',
function: {
name: 'get_weather',
description: 'Get weather for a location',
parameters: { /* schema */ }
}
},
{
type: 'function',
function: {
name: 'search_web',
description: 'Search the web',
parameters: { /* schema */ }
}
},
{
type: 'function',
function: {
name: 'calculate',
description: 'Perform calculations',
parameters: { /* schema */ }
}
}
];
```
The model will choose which tool(s) to call based on the conversation.
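You can also steer this choice with the `tool_choice` parameter; a minimal sketch:
```typescript
// Force a specific tool instead of letting the model decide
const completion = await openai.chat.completions.create({
  model: 'gpt-5',
  messages: [{ role: 'user', content: 'Weather in Tokyo?' }],
  tools: tools,
  tool_choice: { type: 'function', function: { name: 'get_weather' } },
});
// tool_choice also accepts 'auto' (default), 'none', or 'required'
```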
---
Structured Outputs
Structured outputs allow you to enforce JSON schema validation on model responses.
Using JSON Schema
```typescript
const completion = await openai.chat.completions.create({
model: 'gpt-4o', // Note: Structured outputs best supported on GPT-4o
messages: [
{ role: 'user', content: 'Generate a person profile' }
],
response_format: {
type: 'json_schema',
json_schema: {
name: 'person_profile',
strict: true,
schema: {
type: 'object',
properties: {
name: { type: 'string' },
age: { type: 'number' },
skills: {
type: 'array',
items: { type: 'string' }
}
},
required: ['name', 'age', 'skills'],
additionalProperties: false
}
}
}
});
const person = JSON.parse(completion.choices[0].message.content);
// { name: "Alice", age: 28, skills: ["TypeScript", "React"] }
```
JSON Mode (Simple)
For simpler use cases without strict schema validation:
```typescript
const completion = await openai.chat.completions.create({
model: 'gpt-5',
messages: [
{ role: 'user', content: 'List 3 programming languages as JSON' }
],
response_format: { type: 'json_object' }
});
const data = JSON.parse(completion.choices[0].message.content);
```
Important: When using response_format, include "JSON" in your prompt to guide the model.
---
Vision (GPT-4o)
GPT-4o supports image understanding alongside text.
Image via URL
```typescript
const completion = await openai.chat.completions.create({
model: 'gpt-4o',
messages: [
{
role: 'user',
content: [
{ type: 'text', text: 'What is in this image?' },
{
type: 'image_url',
image_url: {
url: 'https://example.com/image.jpg'
}
}
]
}
]
});
```
Image via Base64
```typescript
import fs from 'fs';
const imageBuffer = fs.readFileSync('./image.jpg');
const base64Image = imageBuffer.toString('base64');
const completion = await openai.chat.completions.create({
model: 'gpt-4o',
messages: [
{
role: 'user',
content: [
{ type: 'text', text: 'Describe this image in detail' },
{
type: 'image_url',
image_url: {
url: `data:image/jpeg;base64,${base64Image}`
}
}
]
}
]
});
```
Multiple Images
```typescript
const completion = await openai.chat.completions.create({
model: 'gpt-4o',
messages: [
{
role: 'user',
content: [
{ type: 'text', text: 'Compare these two images' },
{ type: 'image_url', image_url: { url: 'https://example.com/image1.jpg' } },
{ type: 'image_url', image_url: { url: 'https://example.com/image2.jpg' } }
]
}
]
});
```
---
Embeddings API
Endpoint: POST /v1/embeddings
Embeddings convert text into high-dimensional vectors for semantic search, clustering, recommendations, and retrieval-augmented generation (RAG).
Supported Models
#### text-embedding-3-large
- Default dimensions: 3072
- Custom dimensions: 256-3072
- Best for: Highest quality semantic understanding
- Use case: Production RAG, advanced semantic search
#### text-embedding-3-small
- Default dimensions: 1536
- Custom dimensions: 256-1536
- Best for: Cost-effective embeddings
- Use case: Most applications, high-volume processing
#### text-embedding-ada-002 (Legacy)
- Dimensions: 1536 (fixed)
- Status: Still supported, use v3 models for new projects
Basic Request (Node.js SDK)
```typescript
import OpenAI from 'openai';
const openai = new OpenAI();
const embedding = await openai.embeddings.create({
model: 'text-embedding-3-small',
input: 'The food was delicious and the waiter was friendly.',
});
console.log(embedding.data[0].embedding);
// [0.0023064255, -0.009327292, ..., -0.0028842222]
```
Basic Request (Fetch - Cloudflare Workers)
```typescript
const response = await fetch('https://api.openai.com/v1/embeddings', {
method: 'POST',
headers: {
'Authorization': `Bearer ${env.OPENAI_API_KEY}`,
'Content-Type': 'application/json',
},
body: JSON.stringify({
model: 'text-embedding-3-small',
input: 'The food was delicious and the waiter was friendly.',
}),
});
const data = await response.json();
const embedding = data.data[0].embedding;
```
Response Structure
```typescript
{
object: "list",
data: [
{
object: "embedding",
embedding: [0.0023064255, -0.009327292, ...], // Array of floats
index: 0
}
],
model: "text-embedding-3-small",
usage: {
prompt_tokens: 8,
total_tokens: 8
}
}
```
Custom Dimensions
Control embedding dimensions to reduce storage/processing:
```typescript
const embedding = await openai.embeddings.create({
model: 'text-embedding-3-small',
input: 'Sample text',
dimensions: 256, // Reduced from 1536 default
});
```
Supported ranges:
- `text-embedding-3-large`: 256-3072
- `text-embedding-3-small`: 256-1536
Benefits:
- Smaller storage (4x-12x reduction)
- Faster similarity search
- Lower memory usage
- Minimal quality loss for many use cases
Batch Processing
Process multiple texts in a single request:
```typescript
const embeddings = await openai.embeddings.create({
model: 'text-embedding-3-small',
input: [
'First document text',
'Second document text',
'Third document text',
],
});
// Access individual embeddings
embeddings.data.forEach((item, index) => {
console.log(`Embedding ${index}:`, item.embedding);
});
```
Limits:
- Max tokens per input: 8192
- Max summed tokens across all inputs: 300,000
- Array dimension max: 2048
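A minimal sketch of batching within those limits (the helper name and batch size are illustrative; it does not count tokens, so shrink the batch size for long documents):
```typescript
// Embed a large corpus in slices that respect the 2048-input array limit
async function embedAll(texts: string[], batchSize = 2048): Promise<number[][]> {
  const vectors: number[][] = [];
  for (let i = 0; i < texts.length; i += batchSize) {
    const res = await openai.embeddings.create({
      model: 'text-embedding-3-small',
      input: texts.slice(i, i + batchSize),
    });
    vectors.push(...res.data.map(d => d.embedding)); // results return in input order
  }
  return vectors;
}
```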
Dimension Reduction Pattern
Post-generation truncation (alternative to dimensions parameter):
```typescript
// Get full embedding
const response = await openai.embeddings.create({
model: 'text-embedding-3-small',
input: 'Testing 123',
});
// Truncate to desired dimensions
const fullEmbedding = response.data[0].embedding;
const truncated = fullEmbedding.slice(0, 256);
// Normalize (L2)
function normalizeL2(vector: number[]): number[] {
const magnitude = Math.sqrt(vector.reduce((sum, val) => sum + val * val, 0));
return vector.map(val => val / magnitude);
}
const normalized = normalizeL2(truncated);
```
RAG Integration Pattern
Complete retrieval-augmented generation workflow:
```typescript
import OpenAI from 'openai';
const openai = new OpenAI();
// 1. Generate embeddings for knowledge base
async function embedKnowledgeBase(documents: string[]) {
const response = await openai.embeddings.create({
model: 'text-embedding-3-small',
input: documents,
});
return response.data.map(item => item.embedding);
}
// 2. Embed user query
async function embedQuery(query: string) {
const response = await openai.embeddings.create({
model: 'text-embedding-3-small',
input: query,
});
return response.data[0].embedding;
}
// 3. Cosine similarity
function cosineSimilarity(a: number[], b: number[]): number {
const dotProduct = a.reduce((sum, val, i) => sum + val * b[i], 0);
const magnitudeA = Math.sqrt(a.reduce((sum, val) => sum + val * val, 0));
const magnitudeB = Math.sqrt(b.reduce((sum, val) => sum + val * val, 0));
return dotProduct / (magnitudeA * magnitudeB);
}
// 4. Find most similar documents
async function findSimilar(query: string, knowledgeBase: { text: string, embedding: number[] }[]) {
const queryEmbedding = await embedQuery(query);
const results = knowledgeBase.map(doc => ({
text: doc.text,
similarity: cosineSimilarity(queryEmbedding, doc.embedding),
}));
return results.sort((a, b) => b.similarity - a.similarity);
}
// 5. RAG: Retrieve + Generate
async function rag(query: string, knowledgeBase: { text: string, embedding: number[] }[]) {
const similarDocs = await findSimilar(query, knowledgeBase);
const context = similarDocs.slice(0, 3).map(d => d.text).join('\n\n');
const completion = await openai.chat.completions.create({
model: 'gpt-5',
messages: [
{
role: 'system',
content: `Answer questions using the following context:\n\n${context}`
},
{
role: 'user',
content: query
}
],
});
return completion.choices[0].message.content;
}
```
Embeddings Best Practices
✅ Model Selection:
- Use `text-embedding-3-small` for most applications (1536 dims, cost-effective)
- Use `text-embedding-3-large` for highest quality (3072 dims)

✅ Performance:
- Batch embed up to 2048 documents per request
- Use custom dimensions (256-512) for storage/speed optimization
- Cache embeddings (they're deterministic for same input; see the sketch below)

✅ Accuracy:
- Normalize embeddings before storing (L2 normalization)
- Use cosine similarity for comparison
- Preprocess text consistently (lowercasing, removing special chars)

❌ Don't:
- Exceed 8192 tokens per input (will error)
- Sum >300k tokens across batch (will error)
- Mix models (incompatible dimensions)
- Forget to normalize when using truncated embeddings
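A minimal sketch of the caching point above, keyed on model plus input (swap the Map for Redis/KV in production):
```typescript
// In-memory embedding cache; safe because identical input yields identical vectors
const embeddingCache = new Map<string, number[]>();

async function cachedEmbedding(text: string, model = 'text-embedding-3-small'): Promise<number[]> {
  const key = `${model}:${text}`;
  const cached = embeddingCache.get(key);
  if (cached) return cached;
  const res = await openai.embeddings.create({ model, input: text });
  embeddingCache.set(key, res.data[0].embedding);
  return res.data[0].embedding;
}
```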
---
Images API
OpenAI's Images API supports image generation with DALL-E 3 and image editing with GPT-Image-1.
Image Generation (DALL-E 3)
Endpoint: POST /v1/images/generations
Generate images from text prompts using DALL-E 3.
#### Basic Request (Node.js SDK)
```typescript
import OpenAI from 'openai';
const openai = new OpenAI();
const image = await openai.images.generate({
model: 'dall-e-3',
prompt: 'A white siamese cat with striking blue eyes',
size: '1024x1024',
quality: 'standard',
style: 'vivid',
n: 1,
});
console.log(image.data[0].url);
console.log(image.data[0].revised_prompt);
```
#### Basic Request (Fetch - Cloudflare Workers)
```typescript
const response = await fetch('https://api.openai.com/v1/images/generations', {
method: 'POST',
headers: {
'Authorization': `Bearer ${env.OPENAI_API_KEY}`,
'Content-Type': 'application/json',
},
body: JSON.stringify({
model: 'dall-e-3',
prompt: 'A white siamese cat with striking blue eyes',
size: '1024x1024',
quality: 'standard',
style: 'vivid',
}),
});
const data = await response.json();
const imageUrl = data.data[0].url;
```
#### Parameters
size - Image dimensions:
- `"1024x1024"` (square)
- `"1024x1536"` (portrait)
- `"1536x1024"` (landscape)
- `"1024x1792"` (tall portrait)
- `"1792x1024"` (wide landscape)

quality - Rendering quality:
- `"standard"`: Normal quality, faster, cheaper
- `"hd"`: High definition with finer details, costs more

style - Visual style:
- `"vivid"`: Hyper-real, dramatic, high-contrast images
- `"natural"`: More natural, less dramatic styling

response_format - Output format:
- `"url"`: Returns temporary URL (expires in 1 hour)
- `"b64_json"`: Returns base64-encoded image data

n - Number of images:
- DALL-E 3 only supports `n: 1`
- DALL-E 2 supports `n: 1-10`
#### Response Structure
```typescript
{
created: 1700000000,
data: [
{
url: "https://oaidalleapiprodscus.blob.core.windows.net/...",
revised_prompt: "A pristine white Siamese cat with striking blue eyes, sitting elegantly..."
}
]
}
```
Note: DALL-E 3 may revise your prompt for safety/quality. The revised_prompt field shows what was actually used.
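Because `url` responses expire after about an hour (see `response_format` above), download the bytes if you need the image later; a minimal Node.js sketch:
```typescript
// Persist a generated image before its temporary URL expires
import fs from 'fs';

async function saveImage(url: string, path: string): Promise<void> {
  const res = await fetch(url);
  if (!res.ok) throw new Error(`Download failed: ${res.status}`);
  fs.writeFileSync(path, Buffer.from(await res.arrayBuffer()));
}

await saveImage(image.data[0].url, 'cat.png'); // `image` from the generation example above
```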
#### Quality Comparison
```typescript
// Standard quality (faster, cheaper)
const standardImage = await openai.images.generate({
model: 'dall-e-3',
prompt: 'A futuristic city at sunset',
quality: 'standard',
});
// HD quality (finer details, costs more)
const hdImage = await openai.images.generate({
model: 'dall-e-3',
prompt: 'A futuristic city at sunset',
quality: 'hd',
});
```
#### Style Comparison
```typescript
// Vivid style (hyper-real, dramatic)
const vividImage = await openai.images.generate({
model: 'dall-e-3',
prompt: 'A mountain landscape',
style: 'vivid',
});
// Natural style (more realistic, less dramatic)
const naturalImage = await openai.images.generate({
model: 'dall-e-3',
prompt: 'A mountain landscape',
style: 'natural',
});
```
#### Base64 Output
```typescript
const image = await openai.images.generate({
model: 'dall-e-3',
prompt: 'A cyberpunk street scene',
response_format: 'b64_json',
});
const base64Data = image.data[0].b64_json;
// Convert to buffer and save
import fs from 'fs';
const buffer = Buffer.from(base64Data, 'base64');
fs.writeFileSync('image.png', buffer);
```
Image Editing (GPT-Image-1)
Endpoint: POST /v1/images/edits
Edit or composite images using AI.
Important: This endpoint uses multipart/form-data, not JSON.
#### Basic Edit Request
```typescript
import fs from 'fs';
import FormData from 'form-data';
const formData = new FormData();
formData.append('model', 'gpt-image-1');
formData.append('image', fs.createReadStream('./woman.jpg'));
formData.append('image_2', fs.createReadStream('./logo.png'));
formData.append('prompt', 'Add the logo to the woman\'s top, as if stamped into the fabric.');
formData.append('input_fidelity', 'high');
formData.append('size', '1024x1024');
formData.append('quality', 'auto');
const response = await fetch('https://api.openai.com/v1/images/edits', {
method: 'POST',
headers: {
'Authorization': `Bearer ${process.env.OPENAI_API_KEY}`,
...formData.getHeaders(),
},
body: formData,
});
const data = await response.json();
const editedImageUrl = data.data[0].url;
```
#### Edit Parameters
model: "gpt-image-1" (required)
image: Primary image file (PNG, JPEG, WebP)
image_2: Secondary image for compositing (optional)
prompt: Text description of desired edits
input_fidelity:
- `"low"`: More creative freedom
- `"medium"`: Balance
- `"high"`: Stay closer to original

size: Same options as generation

quality:
- `"auto"`: Automatic quality selection
- `"standard"`: Normal quality
- `"high"`: Higher quality

format - Output format:
- `"png"`: PNG (supports transparency)
- `"jpeg"`: JPEG (no transparency)
- `"webp"`: WebP (smaller file size)

background - Background handling:
- `"transparent"`: Transparent background (PNG/WebP only)
- `"white"`: White background
- `"black"`: Black background

output_compression: JPEG/WebP compression (0-100)
- 0: Maximum compression (smallest file)
- 100: Minimum compression (highest quality)
#### Transparent Background Example
```typescript
const formData = new FormData();
formData.append('model', 'gpt-image-1');
formData.append('image', fs.createReadStream('./product.jpg'));
formData.append('prompt', 'Remove the background, keeping only the product.');
formData.append('format', 'png');
formData.append('background', 'transparent');
const response = await fetch('https://api.openai.com/v1/images/edits', {
method: 'POST',
headers: {
'Authorization': `Bearer ${process.env.OPENAI_API_KEY}`,
...formData.getHeaders(),
},
body: formData,
});
```
Images Best Practices
✅ Prompting:
- Be specific about details (colors, composition, style)
- Include artistic style references ("oil painting", "photograph", "3D render")
- Specify lighting ("golden hour", "studio lighting", "dramatic shadows")
- DALL-E 3 may revise prompts; check `revised_prompt`

✅ Performance:
- Use `"standard"` quality unless HD details are critical
- Use `"natural"` style for realistic images
- Use `"vivid"` style for marketing/artistic images
- Cache generated images (they're non-deterministic)

✅ Cost Optimization:
- Standard quality is cheaper than HD
- Smaller sizes cost less
- Use appropriate size for your use case (don't generate 1792x1024 if you need 512x512)

❌ Don't:
- Request multiple images with DALL-E 3 (n=1 only)
- Expect deterministic output (same prompt = different images)
- Use URLs that expire (save images if needed long-term)
- Forget to handle revised prompts (DALL-E 3 modifies for safety)
---
Audio API
OpenAI's Audio API provides speech-to-text (Whisper) and text-to-speech (TTS) capabilities.
Whisper Transcription
Endpoint: POST /v1/audio/transcriptions
Convert audio to text using Whisper.
#### Supported Audio Formats
- mp3
- mp4
- mpeg
- mpga
- m4a
- wav
- webm
#### Basic Transcription (Node.js SDK)
```typescript
import OpenAI from 'openai';
import fs from 'fs';
const openai = new OpenAI();
const transcription = await openai.audio.transcriptions.create({
file: fs.createReadStream('./audio.mp3'),
model: 'whisper-1',
});
console.log(transcription.text);
```
#### Basic Transcription (Fetch)
```typescript
import fs from 'fs';
import FormData from 'form-data';
const formData = new FormData();
formData.append('file', fs.createReadStream('./audio.mp3'));
formData.append('model', 'whisper-1');
const response = await fetch('https://api.openai.com/v1/audio/transcriptions', {
method: 'POST',
headers: {
'Authorization': `Bearer ${process.env.OPENAI_API_KEY}`,
...formData.getHeaders(),
},
body: formData,
});
const data = await response.json();
console.log(data.text);
```
#### Response Structure
```typescript
{
text: "Hello, this is a transcription of the audio file."
}
```
Text-to-Speech (TTS)
Endpoint: POST /v1/audio/speech
Convert text to natural-sounding speech.
#### Supported Models
tts-1
- Standard quality
- Optimized for real-time streaming
- Lowest latency
tts-1-hd
- High definition quality
- Better audio fidelity
- Slightly higher latency
gpt-4o-mini-tts
- Latest model (November 2024)
- Supports voice instructions
- Best quality and control
#### Available Voices (11 total)
- alloy: Neutral, balanced voice
- ash: Clear, professional voice
- ballad: Warm, storytelling voice
- coral: Soft, friendly voice
- echo: Calm, measured voice
- fable: Expressive, narrative voice
- onyx: Deep, authoritative voice
- nova: Bright, energetic voice
- sage: Wise, thoughtful voice
- shimmer: Gentle, soothing voice
- verse: Poetic, rhythmic voice
#### Basic TTS (Node.js SDK)
```typescript
import OpenAI from 'openai';
import fs from 'fs';
const openai = new OpenAI();
const mp3 = await openai.audio.speech.create({
model: 'tts-1',
voice: 'alloy',
input: 'The quick brown fox jumped over the lazy dog.',
});
const buffer = Buffer.from(await mp3.arrayBuffer());
fs.writeFileSync('speech.mp3', buffer);
```
#### Basic TTS (Fetch)
```typescript
const response = await fetch('https://api.openai.com/v1/audio/speech', {
method: 'POST',
headers: {
'Authorization': `Bearer ${process.env.OPENAI_API_KEY}`,
'Content-Type': 'application/json',
},
body: JSON.stringify({
model: 'tts-1',
voice: 'alloy',
input: 'The quick brown fox jumped over the lazy dog.',
}),
});
const audioBuffer = await response.arrayBuffer();
// Save or stream the audio
```
#### TTS Parameters
input: Text to convert to speech (max 4096 characters)
voice: One of 11 voices (alloy, ash, ballad, coral, echo, fable, onyx, nova, sage, shimmer, verse)
model: "tts-1" | "tts-1-hd" | "gpt-4o-mini-tts"
instructions: Voice control instructions (gpt-4o-mini-tts only)
- Not supported by tts-1 or tts-1-hd
- Examples: "Speak in a calm, soothing tone", "Use a professional business voice"
response_format: Output audio format
- "mp3" (default)
- "opus"
- "aac"
- "flac"
- "wav"
- "pcm"
speed: Playback speed (0.25 to 4.0, default 1.0)
- 0.25 = quarter speed (very slow)
- 1.0 = normal speed
- 2.0 = double speed
- 4.0 = quadruple speed (very fast)
#### Voice Instructions (gpt-4o-mini-tts)
```typescript
const speech = await openai.audio.speech.create({
model: 'gpt-4o-mini-tts',
voice: 'nova',
input: 'Welcome to our customer support line.',
instructions: 'Speak in a calm, professional, and friendly tone suitable for customer service.',
});
```
Instruction Examples:
- "Speak slowly and clearly for educational content"
- "Use an enthusiastic, energetic tone for marketing"
- "Adopt a calm, soothing voice for meditation guidance"
- "Sound authoritative and confident for presentations"
#### Speed Control
```typescript
// Slow speech (0.5x speed)
const slowSpeech = await openai.audio.speech.create({
model: 'tts-1',
voice: 'alloy',
input: 'This will be spoken slowly.',
speed: 0.5,
});
// Fast speech (1.5x speed)
const fastSpeech = await openai.audio.speech.create({
model: 'tts-1',
voice: 'alloy',
input: 'This will be spoken quickly.',
speed: 1.5,
});
```
#### Different Audio Formats
```typescript
// MP3 (most compatible, default)
const mp3 = await openai.audio.speech.create({
model: 'tts-1',
voice: 'alloy',
input: 'Hello',
response_format: 'mp3',
});
// Opus (best for web streaming)
const opus = await openai.audio.speech.create({
model: 'tts-1',
voice: 'alloy',
input: 'Hello',
response_format: 'opus',
});
// WAV (uncompressed, highest quality)
const wav = await openai.audio.speech.create({
model: 'tts-1',
voice: 'alloy',
input: 'Hello',
response_format: 'wav',
});
```
#### Streaming TTS (Server-Sent Events)
```typescript
const response = await fetch('https://api.openai.com/v1/audio/speech', {
method: 'POST',
headers: {
'Authorization': `Bearer ${process.env.OPENAI_API_KEY}`,
'Content-Type': 'application/json',
},
body: JSON.stringify({
model: 'gpt-4o-mini-tts',
voice: 'nova',
input: 'Long text to be streamed as audio chunks...',
stream_format: 'sse', // Server-Sent Events
}),
});
// Stream audio chunks
const reader = response.body?.getReader();
while (true) {
const { done, value } = await reader!.read();
if (done) break;
// Process audio chunk
processAudioChunk(value);
}
```
Note: SSE streaming (stream_format: "sse") is only supported by gpt-4o-mini-tts. tts-1 and tts-1-hd do not support streaming.
Audio Best Practices
✅ Transcription:
- Use supported formats (mp3, wav, m4a)
- Ensure clear audio quality
- Whisper handles multiple languages automatically
- Works best with clean audio (minimal background noise)

✅ Text-to-Speech:
- Use `tts-1` for real-time, low-latency playback
- Use `tts-1-hd` for higher quality offline audio
- Use `gpt-4o-mini-tts` for voice instructions and SSE streaming
- Choose voice based on use case (alloy for neutral, onyx for authoritative, etc.)
- Test different voices to find best fit
- Use instructions (gpt-4o-mini-tts) for fine-grained control

✅ Performance:
- Cache generated audio (deterministic for same input)
- Use opus format for web streaming (smaller file size)
- Use mp3 for maximum compatibility
- Stream audio with `stream_format: "sse"` for real-time playback

❌ Don't:
- Exceed 4096 characters for TTS input
- Use instructions with tts-1 or tts-1-hd (not supported)
- Use SSE streaming with tts-1/tts-1-hd (use gpt-4o-mini-tts)
- Assume transcription is perfect (always review important content)
---
Moderation API
Endpoint: POST /v1/moderations
Check content for policy violations across 11 safety categories.
Basic Moderation (Node.js SDK)
```typescript
import OpenAI from 'openai';
const openai = new OpenAI();
const moderation = await openai.moderations.create({
model: 'omni-moderation-latest',
input: 'I want to hurt someone.',
});
console.log(moderation.results[0].flagged);
console.log(moderation.results[0].categories);
console.log(moderation.results[0].category_scores);
```
Basic Moderation (Fetch)
```typescript
const response = await fetch('https://api.openai.com/v1/moderations', {
method: 'POST',
headers: {
'Authorization': `Bearer ${env.OPENAI_API_KEY}`,
'Content-Type': 'application/json',
},
body: JSON.stringify({
model: 'omni-moderation-latest',
input: 'I want to hurt someone.',
}),
});
const data = await response.json();
const isFlagged = data.results[0].flagged;
```
Response Structure
```typescript
{
id: "modr-ABC123",
model: "omni-moderation-latest",
results: [
{
flagged: true,
categories: {
sexual: false,
hate: false,
harassment: true,
"self-harm": false,
"sexual/minors": false,
"hate/threatening": false,
"violence/graphic": false,
"self-harm/intent": false,
"self-harm/instructions": false,
"harassment/threatening": true,
violence: true
},
category_scores: {
sexual: 0.000011726,
hate: 0.2270666,
harassment: 0.5215635,
"self-harm": 0.0000123,
"sexual/minors": 0.0000001,
"hate/threatening": 0.0123456,
"violence/graphic": 0.0123456,
"self-harm/intent": 0.0000123,
"self-harm/instructions": 0.0000123,
"harassment/threatening": 0.4123456,
violence: 0.9971135
}
}
]
}
```
Safety Categories (11 total)
sexual: Sexual content
- Erotic or pornographic material
- Sexual services
hate: Hateful content
- Content promoting hate based on identity
- Dehumanizing language
harassment: Harassing content
- Bullying or intimidation
- Personal attacks
self-harm: Self-harm content
- Promoting or encouraging self-harm
- Suicide-related content
sexual/minors: Sexual content involving minors
- Any sexualization of children
- Child abuse material (CSAM)
hate/threatening: Hateful + threatening
- Violent threats based on identity
- Calls for violence against protected groups
violence/graphic: Graphic violence
- Extreme gore or violence
- Graphic injury descriptions
self-harm/intent: Self-harm intent
- Active expressions of suicidal ideation
- Plans to self-harm
self-harm/instructions: Self-harm instructions
- How-to guides for self-harm
- Methods for suicide
harassment/threatening: Harassment + threats
- Violent threats toward individuals
- Credible harm threats
violence: Violent content
- Threats of violence
- Glorification of violence
- Instructions for violence
Category Scores
Scores range from 0 to 1:
- 0.0: Very low confidence
- 0.5: Medium confidence
- 1.0: Very high confidence
Recommended Thresholds
```typescript
const thresholds: Record<string, number> = {
  sexual: 0.5,
  hate: 0.4,
  harassment: 0.5,
  'self-harm': 0.3,
  'sexual/minors': 0.1, // Lower threshold for child safety
  'hate/threatening': 0.3,
  'violence/graphic': 0.5,
  'self-harm/intent': 0.2,
  'self-harm/instructions': 0.2,
  'harassment/threatening': 0.3,
  violence: 0.5,
};

function isFlagged(result: { category_scores: Record<string, number> }): boolean {
  return Object.entries(result.category_scores).some(
    ([category, score]) => score > thresholds[category]
  );
}
```
Batch Moderation
Moderate multiple inputs in a single request:
```typescript
const moderation = await openai.moderations.create({
model: 'omni-moderation-latest',
input: [
'First text to moderate',
'Second text to moderate',
'Third text to moderate',
],
});
moderation.results.forEach((result, index) => {
console.log(`Input ${index}: ${result.flagged ? 'FLAGGED' : 'OK'}`);
if (result.flagged) {
console.log('Categories:', Object.keys(result.categories).filter(
cat => result.categories[cat]
));
}
});
```
Filtering by Category
```typescript
async function moderateContent(text: string) {
const moderation = await openai.moderations.create({
model: 'omni-moderation-latest',
input: text,
});
const result = moderation.results[0];
// Check specific categories
if (result.categories['sexual/minors']) {
throw new Error('Content violates child safety policy');
}
if (result.categories.violence && result.category_scores.violence > 0.7) {
throw new Error('Content contains high-confidence violence');
}
if (result.categories['self-harm/intent']) {
// Flag for human review
await flagForReview(text, 'self-harm-intent');
}
return result.flagged;
}
```
Production Pattern
```typescript
async function moderateUserContent(userInput: string) {
try {
const moderation = await openai.moderations.create({
model: 'omni-moderation-latest',
input: userInput,
});
const result = moderation.results[0];
// Immediate block for severe categories
const severeCategories = [
'sexual/minors',
'self-harm/intent',
'hate/threatening',
'harassment/threatening',
];
for (const category of severeCategories) {
if (result.categories[category]) {
return {
allowed: false,
reason: `Content flagged for: ${category}`,
severity: 'high',
};
}
}
// Custom threshold check
if (result.category_scores.violence > 0.8) {
return {
allowed: false,
reason: 'High-confidence violence detected',
severity: 'medium',
};
}
// Allow content
return {
allowed: true,
scores: result.category_scores,
};
} catch (error) {
console.error('Moderation error:', error);
// Fail closed: block on error
return {
allowed: false,
reason: 'Moderation service unavailable',
severity: 'error',
};
}
}
```
Moderation Best Practices
✅ Safety:
- Always moderate user-generated content before storing/displaying
- Use lower thresholds for child safety (`sexual/minors`)
- Block immediately on severe categories
- Log all flagged content for review

✅ User Experience:
- Provide clear feedback when content is flagged
- Allow users to edit and resubmit
- Explain which policy was violated (without revealing detection details)
- Implement appeals process for false positives

✅ Performance:
- Batch moderate multiple inputs (up to array limit)
- Cache moderation results for identical content
- Moderate before expensive operations (AI generation, storage)
- Use async moderation for non-critical flows

✅ Compliance:
- Keep audit logs of all moderation decisions
- Implement human review for borderline cases
- Update thresholds based on your community standards
- Comply with local content regulations

❌ Don't:
- Skip moderation on "trusted" users (all UGC should be checked)
- Rely solely on the `flagged` boolean (check specific categories)
- Ignore category scores (they provide nuance)
- Use moderation as sole content policy enforcement (combine with human review)
---
Error Handling
Common HTTP Status Codes
- 200: Success
- 400: Bad Request (invalid parameters)
- 401: Unauthorized (invalid API key)
- 429: Rate Limit Exceeded
- 500: Server Error
- 503: Service Unavailable
Rate Limit Error (429)
```typescript
try {
const completion = await openai.chat.completions.create({ /* ... */ });
} catch (error) {
if (error.status === 429) {
// Rate limit exceeded - implement exponential backoff
console.error('Rate limit exceeded. Retry after delay.');
}
}
```
Invalid API Key (401)
```typescript
try {
const completion = await openai.chat.completions.create({ /* ... */ });
} catch (error) {
if (error.status === 401) {
console.error('Invalid API key. Check OPENAI_API_KEY environment variable.');
}
}
```
Exponential Backoff Pattern
```typescript
async function completionWithRetry(params, maxRetries = 3) {
for (let i = 0; i < maxRetries; i++) {
try {
return await openai.chat.completions.create(params);
} catch (error) {
if (error.status === 429 && i < maxRetries - 1) {
const delay = Math.pow(2, i) * 1000; // 1s, 2s, 4s
await new Promise(resolve => setTimeout(resolve, delay));
continue;
}
throw error;
}
}
}
```
---
Rate Limits
Understanding Rate Limits
OpenAI enforces rate limits based on:
- RPM: Requests Per Minute
- TPM: Tokens Per Minute
- IPM: Images Per Minute (for DALL-E)
Limits vary by:
- Usage tier (Free, Tier 1-5)
- Model (GPT-5 has different limits than GPT-4)
- Organization settings
Checking Rate Limit Headers
```typescript
const response = await fetch('https://api.openai.com/v1/chat/completions', {
method: 'POST',
headers: {
'Authorization': `Bearer ${apiKey}`,
'Content-Type': 'application/json',
},
body: JSON.stringify({ /* ... */ }),
});
console.log(response.headers.get('x-ratelimit-limit-requests'));
console.log(response.headers.get('x-ratelimit-remaining-requests'));
console.log(response.headers.get('x-ratelimit-reset-requests'));
```
Best Practices
✅ Implement exponential backoff for 429 errors
✅ Monitor rate limit headers to avoid hitting limits (see the sketch below)
✅ Batch requests when possible (e.g., embeddings)
✅ Use appropriate models (don't use GPT-5 for simple tasks)
✅ Cache responses when appropriate
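A hedged sketch of the header-monitoring point above: a fetch wrapper that pauses when the remaining-request budget runs low (the threshold and delay are illustrative, not official guidance).
```typescript
// Proactive throttle: inspect rate-limit headers after each call and pause when low
async function rateAwareFetch(url: string, init: RequestInit): Promise<Response> {
  const res = await fetch(url, init);
  const remaining = Number(res.headers.get('x-ratelimit-remaining-requests') ?? '1');
  if (res.status !== 429 && remaining < 5) {
    // The 1s pause is an arbitrary cushion; tune it to your tier's limits
    await new Promise(resolve => setTimeout(resolve, 1000));
  }
  return res;
}
```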
---
Production Best Practices
Security
✅ Never expose API keys in client-side code
```typescript
// ❌ Bad - API key in browser
const apiKey = 'sk-...'; // Visible to users!
// ✅ Good - Server-side proxy
// Client calls your backend, which calls OpenAI
```
✅ Use environment variables
```bash
export OPENAI_API_KEY="sk-..."
```
✅ Implement server-side proxy for browser apps
```typescript
// Your backend endpoint
app.post('/api/chat', async (req, res) => {
const completion = await openai.chat.completions.create({
model: 'gpt-5',
messages: req.body.messages,
});
res.json(completion);
});
```
Performance
✅ Use streaming for long-form content (>100 tokens)
✅ Set appropriate max_tokens to control costs and latency
✅ Cache responses when queries are repeated
✅ Choose appropriate models:
- GPT-5-nano for simple tasks
- GPT-5 for complex reasoning
- GPT-4o for vision tasks
Cost Optimization
✅ Select right model:
- gpt-5-nano: Cheapest, fastest
- gpt-5-mini: Balance of cost/quality
- gpt-5: Best quality, most expensive
✅ Limit max_tokens:
```typescript
{
max_tokens: 500, // Don't generate more than needed
}
```
✅ Use caching:
```typescript
const cache = new Map();
async function getCachedCompletion(prompt) {
if (cache.has(prompt)) {
return cache.get(prompt);
}
const completion = await openai.chat.completions.create({
model: 'gpt-5',
messages: [{ role: 'user', content: prompt }],
});
cache.set(prompt, completion);
return completion;
}
```
Error Handling
✅ Wrap all API calls in try-catch
✅ Provide user-friendly error messages
✅ Log errors for debugging
✅ Implement retries for transient failures
```typescript
try {
const completion = await openai.chat.completions.create({ /* ... */ });
} catch (error) {
console.error('OpenAI API error:', error);
// User-friendly message
return {
error: 'Sorry, I encountered an issue. Please try again.',
};
}
```
---
Relationship to openai-responses
openai-api (This Skill)
Traditional/stateless API for:
- ✅ Simple chat completions
- ✅ Embeddings for RAG/search
- ✅ Images (DALL-E 3)
- ✅ Audio (Whisper/TTS)
- ✅ Content moderation
- ✅ One-off text generation
- ✅ Cloudflare Workers / edge deployment
Characteristics:
- Stateless (you manage conversation history)
- No built-in tools
- Maximum flexibility
- Works everywhere (Node.js, browsers, Workers, etc.)
openai-responses Skill
Stateful/agentic API for:
- ✅ Automatic conversation state management
- ✅ Preserved reasoning (Chain of Thought) across turns
- ✅ Built-in tools (Code Interpreter, File Search, Web Search, Image Generation)
- ✅ MCP server integration
- ✅ Background mode for long tasks
- ✅ Polymorphic outputs
Characteristics:
- Stateful (OpenAI manages conversation)
- Built-in tools included
- Better for agentic workflows
- Higher-level abstraction
When to Use Which?
| Use Case | Use openai-api | Use openai-responses |
|----------|----------------|---------------------|
| Simple chat | ✅ | ❌ |
| RAG/embeddings | ✅ | ❌ |
| Image generation | ✅ | ❌ |
| Audio processing | ✅ | ❌ |
| Agentic workflows | ❌ | ✅ |
| Multi-turn reasoning | ❌ | ✅ |
| Background tasks | ❌ | ✅ |
| Custom tools only | ✅ | ❌ |
| Built-in + custom tools | ❌ | ✅ |
Use both: Many apps use openai-api for embeddings/images/audio and openai-responses for stateful, agentic conversations.