Download OpenAPI specification:Download
An OAI compatible exllamav2 API that's both lightweight and fast
This docs page is not meant to send requests! Please use a service like Postman or a frontend UI.
Generates a completion from a prompt.
If stream = true, this returns an SSE stream.
x-api-key | string (X-Api-Key) |
authorization | string (Authorization) |
Max Tokens (integer) or Max Tokens (null) (Max Tokens) Aliases: max_length | |
Min Tokens (integer) or Min Tokens (null) (Min Tokens) Aliases: min_length | |
Generate Window (integer) or Generate Window (null) (Generate Window) | |
Stop (string) or (Array of Stop (strings or integers)) or Stop (null) (Stop) Aliases: stop_sequence | |
Banned Strings (string) or Array of Banned Strings (strings) or Banned Strings (null) (Banned Strings) | |
Array of Banned Tokens (integers) or Banned Tokens (string) or Banned Tokens (null) (Banned Tokens) Aliases: custom_token_bans | |
Array of Allowed Tokens (integers) or Allowed Tokens (string) or Allowed Tokens (null) (Allowed Tokens) Aliases: allowed_token_ids | |
Token Healing (boolean) or Token Healing (null) (Token Healing) | |
Temperature (number) or Temperature (null) (Temperature) | |
Temperature Last (boolean) or Temperature Last (null) (Temperature Last) | |
Smoothing Factor (number) or Smoothing Factor (null) (Smoothing Factor) | |
Top K (integer) or Top K (null) (Top K) | |
Top P (number) or Top P (null) (Top P) | |
Top A (number) or Top A (null) (Top A) | |
Min P (number) or Min P (null) (Min P) | |
Tfs (number) or Tfs (null) (Tfs) | |
Typical (number) or Typical (null) (Typical) Aliases: typical_p | |
Skew (number) or Skew (null) (Skew) | |
Xtc Probability (number) or Xtc Probability (null) (Xtc Probability) | |
Xtc Threshold (number) or Xtc Threshold (null) (Xtc Threshold) | |
Frequency Penalty (number) or Frequency Penalty (null) (Frequency Penalty) | |
Presence Penalty (number) or Presence Penalty (null) (Presence Penalty) | |
Repetition Penalty (number) or Repetition Penalty (null) (Repetition Penalty) Aliases: rep_pen | |
Penalty Range (integer) or Penalty Range (null) (Penalty Range) Aliases: repetition_range, repetition_penalty_range, rep_pen_range | |
Repetition Decay (integer) or Repetition Decay (null) (Repetition Decay) | |
Dry Multiplier (number) or Dry Multiplier (null) (Dry Multiplier) | |
Dry Base (number) or Dry Base (null) (Dry Base) | |
Dry Allowed Length (integer) or Dry Allowed Length (null) (Dry Allowed Length) | |
Dry Range (integer) or Dry Range (null) (Dry Range) Aliases: dry_penalty_last_n | |
Dry Sequence Breakers (string) or Array of Dry Sequence Breakers (strings) or Dry Sequence Breakers (null) (Dry Sequence Breakers) | |
Mirostat (boolean) or Mirostat (null) (Mirostat) Default: false | |
Mirostat Mode (integer) or Mirostat Mode (null) (Mirostat Mode) | |
Mirostat Tau (number) or Mirostat Tau (null) (Mirostat Tau) | |
Mirostat Eta (number) or Mirostat Eta (null) (Mirostat Eta) | |
Add Bos Token (boolean) or Add Bos Token (null) (Add Bos Token) | |
Ban Eos Token (boolean) or Ban Eos Token (null) (Ban Eos Token) Aliases: ignore_eos | |
Skip Special Tokens (boolean) or Skip Special Tokens (null) (Skip Special Tokens) | |
Logit Bias (object) or Logit Bias (null) (Logit Bias) | |
Negative Prompt (string) or Negative Prompt (null) (Negative Prompt) | |
Json Schema (any) or Json Schema (null) (Json Schema) | |
Regex Pattern (string) or Regex Pattern (null) (Regex Pattern) | |
Grammar String (string) or Grammar String (null) (Grammar String) | |
Speculative Ngram (boolean) or Speculative Ngram (null) (Speculative Ngram) | |
Cfg Scale (number) or Cfg Scale (null) (Cfg Scale) Aliases: guidance_scale | |
Max Temp (number) or Max Temp (null) (Max Temp) Aliases: dynatemp_high | |
Min Temp (number) or Min Temp (null) (Min Temp) Aliases: dynatemp_low | |
Temp Exponent (number) or Temp Exponent (null) (Temp Exponent) | |
Model (string) or Model (null) (Model) | |
Stream (boolean) or Stream (null) (Stream) Default: false | |
ChatCompletionStreamOptions (object) or null | |
Logprobs (integer) or Logprobs (null) (Logprobs) | |
CompletionResponseFormat (object) or null | |
N (integer) or N (null) (N) | |
Best Of (integer) or Best Of (null) (Best Of) Not parsed. Only used for OAI compliance. | |
Echo (boolean) or Echo (null) (Echo) Default: false Not parsed. Only used for OAI compliance. | |
Suffix (string) or Suffix (null) (Suffix) Not parsed. Only used for OAI compliance. | |
User (string) or User (null) (User) Not parsed. Only used for OAI compliance. | |
required | Prompt (string) or Array of Prompt (strings) (Prompt) |
{- "max_tokens": 150,
- "min_tokens": 0,
- "generate_window": 512,
- "stop": "string",
- "banned_strings": "string",
- "banned_tokens": [
- 128,
- 330
], - "allowed_tokens": [
- 128,
- 330
], - "token_healing": true,
- "temperature": 1,
- "temperature_last": true,
- "smoothing_factor": 0,
- "top_k": -1,
- "top_p": 1,
- "top_a": 0,
- "min_p": 0,
- "tfs": 1,
- "typical": 1,
- "skew": 0,
- "xtc_probability": 0,
- "xtc_threshold": 0,
- "frequency_penalty": 0,
- "presence_penalty": 0,
- "repetition_penalty": 1,
- "penalty_range": 0,
- "repetition_decay": 0,
- "dry_multiplier": 0,
- "dry_base": 0,
- "dry_allowed_length": 0,
- "dry_range": 0,
- "dry_sequence_breakers": "string",
- "mirostat": false,
- "mirostat_mode": 0,
- "mirostat_tau": 1.5,
- "mirostat_eta": 0.3,
- "add_bos_token": true,
- "ban_eos_token": false,
- "skip_special_tokens": true,
- "logit_bias": {
- "1": 10,
- "2": 50
}, - "negative_prompt": "string",
- "json_schema": { },
- "regex_pattern": "string",
- "grammar_string": "string",
- "speculative_ngram": true,
- "cfg_scale": 1,
- "max_temp": 1,
- "min_temp": 1,
- "temp_exponent": 1,
- "model": "string",
- "stream": false,
- "stream_options": {
- "include_usage": false
}, - "logprobs": 0,
- "response_format": {
- "type": "text"
}, - "n": 1,
- "best_of": 0,
- "echo": false,
- "suffix": "string",
- "user": "string",
- "prompt": "string"
}
{- "id": "string",
- "choices": [
- {
- "index": 0,
- "finish_reason": "string",
- "logprobs": {
- "text_offset": [
- 0
], - "token_logprobs": [
- 0
], - "tokens": [
- "string"
], - "top_logprobs": [
- {
- "property1": 0,
- "property2": 0
}
]
}, - "text": "string"
}
], - "created": 0,
- "model": "string",
- "object": "text_completion",
- "usage": {
- "prompt_tokens": 0,
- "completion_tokens": 0,
- "total_tokens": 0
}
}
Generates a chat completion from a prompt.
If stream = true, this returns an SSE stream.
x-api-key | string (X-Api-Key) |
authorization | string (Authorization) |
Max Tokens (integer) or Max Tokens (null) (Max Tokens) Aliases: max_length | |
Min Tokens (integer) or Min Tokens (null) (Min Tokens) Aliases: min_length | |
Generate Window (integer) or Generate Window (null) (Generate Window) | |
Stop (string) or (Array of Stop (strings or integers)) or Stop (null) (Stop) Aliases: stop_sequence | |
Banned Strings (string) or Array of Banned Strings (strings) or Banned Strings (null) (Banned Strings) | |
Array of Banned Tokens (integers) or Banned Tokens (string) or Banned Tokens (null) (Banned Tokens) Aliases: custom_token_bans | |
Array of Allowed Tokens (integers) or Allowed Tokens (string) or Allowed Tokens (null) (Allowed Tokens) Aliases: allowed_token_ids | |
Token Healing (boolean) or Token Healing (null) (Token Healing) | |
Temperature (number) or Temperature (null) (Temperature) | |
Temperature Last (boolean) or Temperature Last (null) (Temperature Last) | |
Smoothing Factor (number) or Smoothing Factor (null) (Smoothing Factor) | |
Top K (integer) or Top K (null) (Top K) | |
Top P (number) or Top P (null) (Top P) | |
Top A (number) or Top A (null) (Top A) | |
Min P (number) or Min P (null) (Min P) | |
Tfs (number) or Tfs (null) (Tfs) | |
Typical (number) or Typical (null) (Typical) Aliases: typical_p | |
Skew (number) or Skew (null) (Skew) | |
Xtc Probability (number) or Xtc Probability (null) (Xtc Probability) | |
Xtc Threshold (number) or Xtc Threshold (null) (Xtc Threshold) | |
Frequency Penalty (number) or Frequency Penalty (null) (Frequency Penalty) | |
Presence Penalty (number) or Presence Penalty (null) (Presence Penalty) | |
Repetition Penalty (number) or Repetition Penalty (null) (Repetition Penalty) Aliases: rep_pen | |
Penalty Range (integer) or Penalty Range (null) (Penalty Range) Aliases: repetition_range, repetition_penalty_range, rep_pen_range | |
Repetition Decay (integer) or Repetition Decay (null) (Repetition Decay) | |
Dry Multiplier (number) or Dry Multiplier (null) (Dry Multiplier) | |
Dry Base (number) or Dry Base (null) (Dry Base) | |
Dry Allowed Length (integer) or Dry Allowed Length (null) (Dry Allowed Length) | |
Dry Range (integer) or Dry Range (null) (Dry Range) Aliases: dry_penalty_last_n | |
Dry Sequence Breakers (string) or Array of Dry Sequence Breakers (strings) or Dry Sequence Breakers (null) (Dry Sequence Breakers) | |
Mirostat (boolean) or Mirostat (null) (Mirostat) Default: false | |
Mirostat Mode (integer) or Mirostat Mode (null) (Mirostat Mode) | |
Mirostat Tau (number) or Mirostat Tau (null) (Mirostat Tau) | |
Mirostat Eta (number) or Mirostat Eta (null) (Mirostat Eta) | |
Add Bos Token (boolean) or Add Bos Token (null) (Add Bos Token) | |
Ban Eos Token (boolean) or Ban Eos Token (null) (Ban Eos Token) Aliases: ignore_eos | |
Skip Special Tokens (boolean) or Skip Special Tokens (null) (Skip Special Tokens) | |
Logit Bias (object) or Logit Bias (null) (Logit Bias) | |
Negative Prompt (string) or Negative Prompt (null) (Negative Prompt) | |
Json Schema (any) or Json Schema (null) (Json Schema) | |
Regex Pattern (string) or Regex Pattern (null) (Regex Pattern) | |
Grammar String (string) or Grammar String (null) (Grammar String) | |
Speculative Ngram (boolean) or Speculative Ngram (null) (Speculative Ngram) | |
Cfg Scale (number) or Cfg Scale (null) (Cfg Scale) Aliases: guidance_scale | |
Max Temp (number) or Max Temp (null) (Max Temp) Aliases: dynatemp_high | |
Min Temp (number) or Min Temp (null) (Min Temp) Aliases: dynatemp_low | |
Temp Exponent (number) or Temp Exponent (null) (Temp Exponent) | |
Model (string) or Model (null) (Model) | |
Stream (boolean) or Stream (null) (Stream) Default: false | |
ChatCompletionStreamOptions (object) or null | |
Logprobs (integer) or Logprobs (null) (Logprobs) | |
CompletionResponseFormat (object) or null | |
N (integer) or N (null) (N) | |
Best Of (integer) or Best Of (null) (Best Of) Not parsed. Only used for OAI compliance. | |
Echo (boolean) or Echo (null) (Echo) Default: false Not parsed. Only used for OAI compliance. | |
Suffix (string) or Suffix (null) (Suffix) Not parsed. Only used for OAI compliance. | |
User (string) or User (null) (User) Not parsed. Only used for OAI compliance. | |
required | Messages (string) or Array of Messages (objects) (Messages) |
Prompt Template (string) or Prompt Template (null) (Prompt Template) | |
Add Generation Prompt (boolean) or Add Generation Prompt (null) (Add Generation Prompt) Default: true | |
Template Vars (object) or Template Vars (null) (Template Vars) Default: {} | |
Response Prefix (string) or Response Prefix (null) (Response Prefix) | |
Array of Tools (objects) or Tools (null) (Tools) | |
Array of Functions (objects) or Functions (null) (Functions) |
{- "max_tokens": 150,
- "min_tokens": 0,
- "generate_window": 512,
- "stop": "string",
- "banned_strings": "string",
- "banned_tokens": [
- 128,
- 330
], - "allowed_tokens": [
- 128,
- 330
], - "token_healing": true,
- "temperature": 1,
- "temperature_last": true,
- "smoothing_factor": 0,
- "top_k": -1,
- "top_p": 1,
- "top_a": 0,
- "min_p": 0,
- "tfs": 1,
- "typical": 1,
- "skew": 0,
- "xtc_probability": 0,
- "xtc_threshold": 0,
- "frequency_penalty": 0,
- "presence_penalty": 0,
- "repetition_penalty": 1,
- "penalty_range": 0,
- "repetition_decay": 0,
- "dry_multiplier": 0,
- "dry_base": 0,
- "dry_allowed_length": 0,
- "dry_range": 0,
- "dry_sequence_breakers": "string",
- "mirostat": false,
- "mirostat_mode": 0,
- "mirostat_tau": 1.5,
- "mirostat_eta": 0.3,
- "add_bos_token": true,
- "ban_eos_token": false,
- "skip_special_tokens": true,
- "logit_bias": {
- "1": 10,
- "2": 50
}, - "negative_prompt": "string",
- "json_schema": { },
- "regex_pattern": "string",
- "grammar_string": "string",
- "speculative_ngram": true,
- "cfg_scale": 1,
- "max_temp": 1,
- "min_temp": 1,
- "temp_exponent": 1,
- "model": "string",
- "stream": false,
- "stream_options": {
- "include_usage": false
}, - "logprobs": 0,
- "response_format": {
- "type": "text"
}, - "n": 1,
- "best_of": 0,
- "echo": false,
- "suffix": "string",
- "user": "string",
- "messages": "string",
- "prompt_template": "string",
- "add_generation_prompt": true,
- "template_vars": { },
- "response_prefix": "string",
- "tools": [
- {
- "function": {
- "name": "string",
- "description": "string",
- "parameters": { }
}, - "type": "function"
}
], - "functions": [
- { }
]
}
{- "id": "string",
- "choices": [
- {
- "index": 0,
- "finish_reason": "string",
- "stop_str": "string",
- "message": {
- "role": "string",
- "content": "string",
- "tool_calls": [
- {
- "id": "string",
- "function": {
- "name": "string",
- "arguments": "string"
}, - "type": "function"
}
]
}, - "logprobs": {
- "content": [
- {
- "token": "string",
- "logprob": 0,
- "top_logprobs": [
- { }
]
}
]
}
}
], - "created": 0,
- "model": "string",
- "object": "chat.completion",
- "usage": {
- "prompt_tokens": 0,
- "completion_tokens": 0,
- "total_tokens": 0
}
}
x-api-key | string (X-Api-Key) |
authorization | string (Authorization) |
required | Input (string) or Array of Input (strings) (Input) List of input texts to generate embeddings for. |
encoding_format | string (Encoding Format) Default: "float" Encoding format for the embeddings. Can be 'float' or 'base64'. |
Model (string) or Model (null) (Model) Name of the embedding model to use. If not provided, the default model will be used. |
{- "input": "string",
- "encoding_format": "float",
- "model": "string"
}
{- "object": "list",
- "data": [
- {
- "object": "embedding",
- "embedding": [
- 0
], - "index": 0
}
], - "model": "string",
- "usage": {
- "prompt_tokens": 0,
- "total_tokens": 0,
- "completion_tokens": 0
}
}
Lists all models in the model directory.
Requires an admin key to see all models.
x-api-key | string (X-Api-Key) |
authorization | string (Authorization) |
{- "object": "list",
- "data": [
- {
- "id": "test",
- "object": "model",
- "created": 0,
- "owned_by": "tabbyAPI",
- "logging": {
- "log_prompt": false,
- "log_generation_params": false,
- "log_requests": false
}, - "parameters": {
- "max_seq_len": 0,
- "rope_scale": 1,
- "rope_alpha": 1,
- "cache_size": 0,
- "cache_mode": "FP16",
- "chunk_size": 2048,
- "prompt_template": "string",
- "num_experts_per_token": 0,
- "draft": { }
}
}
]
}
Lists all models in the model directory.
Requires an admin key to see all models.
x-api-key | string (X-Api-Key) |
authorization | string (Authorization) |
{- "object": "list",
- "data": [
- {
- "id": "test",
- "object": "model",
- "created": 0,
- "owned_by": "tabbyAPI",
- "logging": {
- "log_prompt": false,
- "log_generation_params": false,
- "log_requests": false
}, - "parameters": {
- "max_seq_len": 0,
- "rope_scale": 1,
- "rope_alpha": 1,
- "cache_size": 0,
- "cache_mode": "FP16",
- "chunk_size": 2048,
- "prompt_template": "string",
- "num_experts_per_token": 0,
- "draft": { }
}
}
]
}
Returns the currently loaded model.
x-api-key | string (X-Api-Key) |
authorization | string (Authorization) |
{- "id": "test",
- "object": "model",
- "created": 0,
- "owned_by": "tabbyAPI",
- "logging": {
- "log_prompt": false,
- "log_generation_params": false,
- "log_requests": false
}, - "parameters": {
- "max_seq_len": 0,
- "rope_scale": 1,
- "rope_alpha": 1,
- "cache_size": 0,
- "cache_mode": "FP16",
- "chunk_size": 2048,
- "prompt_template": "string",
- "num_experts_per_token": 0,
- "draft": { }
}
}
Lists all draft models in the model directory.
Requires an admin key to see all draft models.
x-api-key | string (X-Api-Key) |
authorization | string (Authorization) |
{- "object": "list",
- "data": [
- {
- "id": "test",
- "object": "model",
- "created": 0,
- "owned_by": "tabbyAPI",
- "logging": {
- "log_prompt": false,
- "log_generation_params": false,
- "log_requests": false
}, - "parameters": {
- "max_seq_len": 0,
- "rope_scale": 1,
- "rope_alpha": 1,
- "cache_size": 0,
- "cache_mode": "FP16",
- "chunk_size": 2048,
- "prompt_template": "string",
- "num_experts_per_token": 0,
- "draft": { }
}
}
]
}
Loads a model into the model container. This returns an SSE stream.
x-admin-key | string (X-Admin-Key) |
authorization | string (Authorization) |
model_name required | string (Model Name) Aliases: name |
Max Seq Len (integer) or Max Seq Len (null) (Max Seq Len) Leave this blank to use the model's base sequence length | |
Cache Size (integer) or Cache Size (null) (Cache Size) Number in tokens, must be greater than or equal to max_seq_len | |
Tensor Parallel (boolean) or Tensor Parallel (null) (Tensor Parallel) | |
Gpu Split Auto (boolean) or Gpu Split Auto (null) (Gpu Split Auto) | |
Array of Autosplit Reserve (numbers) or Autosplit Reserve (null) (Autosplit Reserve) | |
Array of Gpu Split (numbers) or Gpu Split (null) (Gpu Split) | |
Rope Scale (number) or Rope Scale (null) (Rope Scale) Automatically pulled from the model's config if not present | |
Rope Alpha (number) or "auto" (string) or Rope Alpha (null) (Rope Alpha) Automatically calculated if set to "auto" | |
Cache Mode (string) or Cache Mode (null) (Cache Mode) | |
Chunk Size (integer) or Chunk Size (null) (Chunk Size) | |
Prompt Template (string) or Prompt Template (null) (Prompt Template) | |
Num Experts Per Token (integer) or Num Experts Per Token (null) (Num Experts Per Token) | |
DraftModelLoadRequest (object) or null | |
Skip Queue (boolean) or Skip Queue (null) (Skip Queue) Default: false |
{- "model_name": "string",
- "max_seq_len": 4096,
- "cache_size": 4096,
- "tensor_parallel": true,
- "gpu_split_auto": true,
- "autosplit_reserve": [
- 0
], - "gpu_split": [
- 24,
- 20
], - "rope_scale": 1,
- "rope_alpha": 1,
- "cache_mode": "string",
- "chunk_size": 0,
- "prompt_template": "string",
- "num_experts_per_token": 0,
- "draft_model": {
- "draft_model_name": "string",
- "draft_rope_scale": 0,
- "draft_rope_alpha": 1,
- "draft_cache_mode": "string"
}, - "skip_queue": false
}
{- "model_type": "model",
- "module": 0,
- "modules": 0,
- "status": "string"
}
Downloads a model from HuggingFace.
x-admin-key | string (X-Admin-Key) |
authorization | string (Authorization) |
repo_id required | string (Repo Id) |
repo_type | string (Repo Type) Default: "model" |
Folder Name (string) or Folder Name (null) (Folder Name) | |
Revision (string) or Revision (null) (Revision) | |
Token (string) or Token (null) (Token) | |
include | Array of strings (Include) |
exclude | Array of strings (Exclude) |
Chunk Limit (integer) or Chunk Limit (null) (Chunk Limit) | |
Timeout (integer) or Timeout (null) (Timeout) |
{- "repo_id": "string",
- "repo_type": "model",
- "folder_name": "string",
- "revision": "string",
- "token": "string",
- "include": [
- "string"
], - "exclude": [
- "string"
], - "chunk_limit": 0,
- "timeout": 0
}
{- "download_path": "string"
}
Lists all LoRAs in the lora directory.
Requires an admin key to see all LoRAs.
x-api-key | string (X-Api-Key) |
authorization | string (Authorization) |
{- "object": "list",
- "data": [
- {
- "id": "test",
- "object": "lora",
- "created": 0,
- "owned_by": "tabbyAPI",
- "scaling": 0
}
]
}
Lists all LoRAs in the lora directory.
Requires an admin key to see all LoRAs.
x-api-key | string (X-Api-Key) |
authorization | string (Authorization) |
{- "object": "list",
- "data": [
- {
- "id": "test",
- "object": "lora",
- "created": 0,
- "owned_by": "tabbyAPI",
- "scaling": 0
}
]
}
Returns the currently loaded loras.
x-api-key | string (X-Api-Key) |
authorization | string (Authorization) |
{- "object": "list",
- "data": [
- {
- "id": "test",
- "object": "lora",
- "created": 0,
- "owned_by": "tabbyAPI",
- "scaling": 0
}
]
}
Loads a LoRA into the model container.
x-admin-key | string (X-Admin-Key) |
authorization | string (Authorization) |
required | Array of objects (Loras) |
skip_queue | boolean (Skip Queue) Default: false |
{- "loras": [
- {
- "name": "string",
- "scaling": 1
}
], - "skip_queue": false
}
{- "success": [
- "string"
], - "failure": [
- "string"
]
}
Lists all embedding models in the model directory.
Requires an admin key to see all embedding models.
x-api-key | string (X-Api-Key) |
authorization | string (Authorization) |
{- "object": "list",
- "data": [
- {
- "id": "test",
- "object": "model",
- "created": 0,
- "owned_by": "tabbyAPI",
- "logging": {
- "log_prompt": false,
- "log_generation_params": false,
- "log_requests": false
}, - "parameters": {
- "max_seq_len": 0,
- "rope_scale": 1,
- "rope_alpha": 1,
- "cache_size": 0,
- "cache_mode": "FP16",
- "chunk_size": 2048,
- "prompt_template": "string",
- "num_experts_per_token": 0,
- "draft": { }
}
}
]
}
Returns the currently loaded embedding model.
x-api-key | string (X-Api-Key) |
authorization | string (Authorization) |
{- "id": "test",
- "object": "model",
- "created": 0,
- "owned_by": "tabbyAPI",
- "logging": {
- "log_prompt": false,
- "log_generation_params": false,
- "log_requests": false
}, - "parameters": {
- "max_seq_len": 0,
- "rope_scale": 1,
- "rope_alpha": 1,
- "cache_size": 0,
- "cache_mode": "FP16",
- "chunk_size": 2048,
- "prompt_template": "string",
- "num_experts_per_token": 0,
- "draft": { }
}
}
x-admin-key | string (X-Admin-Key) |
authorization | string (Authorization) |
embedding_model_name required | string (Embedding Model Name) Aliases: name |
Embeddings Device (string) or Embeddings Device (null) (Embeddings Device) Default: "cpu" |
{- "embedding_model_name": "string",
- "embeddings_device": "cpu"
}
{- "model_type": "model",
- "module": 0,
- "modules": 0,
- "status": "string"
}
Encodes a string or chat completion messages into tokens.
x-api-key | string (X-Api-Key) |
authorization | string (Authorization) |
add_bos_token | boolean (Add Bos Token) Default: true |
encode_special_tokens | boolean (Encode Special Tokens) Default: true |
decode_special_tokens | boolean (Decode Special Tokens) Default: true |
required | Text (string) or Array of Text (objects) (Text) |
{- "add_bos_token": true,
- "encode_special_tokens": true,
- "decode_special_tokens": true,
- "text": "string"
}
{- "tokens": [
- 0
], - "length": 0
}
Decodes tokens into a string.
x-api-key | string (X-Api-Key) |
authorization | string (Authorization) |
add_bos_token | boolean (Add Bos Token) Default: true |
encode_special_tokens | boolean (Encode Special Tokens) Default: true |
decode_special_tokens | boolean (Decode Special Tokens) Default: true |
tokens required | Array of integers (Tokens) |
{- "add_bos_token": true,
- "encode_special_tokens": true,
- "decode_special_tokens": true,
- "tokens": [
- 0
]
}
{- "text": "string"
}
Switch the currently loaded template.
x-admin-key | string (X-Admin-Key) |
authorization | string (Authorization) |
prompt_template_name required | string (Prompt Template Name) Aliases: name |
{- "prompt_template_name": "string"
}
null
List all currently applied sampler overrides.
Requires an admin key to see all override presets.
x-api-key | string (X-Api-Key) |
authorization | string (Authorization) |
{- "selected_preset": "string",
- "overrides": { },
- "presets": [
- "string"
]
}
List all currently applied sampler overrides.
Requires an admin key to see all override presets.
x-api-key | string (X-Api-Key) |
authorization | string (Authorization) |
{- "selected_preset": "string",
- "overrides": { },
- "presets": [
- "string"
]
}
Switch the currently loaded override preset
x-admin-key | string (X-Admin-Key) |
authorization | string (Authorization) |
Preset (string) or Preset (null) (Preset) Pass a sampler override preset name | |
Overrides (object) or Overrides (null) (Overrides) Sampling override parent takes in individual keys and overrides. Ignored if preset is provided. |
{- "preset": "string",
- "overrides": {
- "top_p": {
- "force": false,
- "override": 1.5
}
}
}
null