Serverless AI Model APIs
Number of APIs: 2
MonsterAPI offers a cutting-edge Large Language Model (LLM) API service that brings together a selection of the most advanced open-source models available on Hugging Face.
Our service provides you with access to diverse capabilities and a range of options to suit your specific needs, whether you're developing applications in natural language processing, chatbots, content generation, or any other area where LLMs can be applied.
Supported Models:
Please note that the pricing below is beta pricing; actual production pricing may change, and we are aiming to bring it down further.
TinyLlama/TinyLlama-1.1B-Chat-v1.0: TinyLlama 1.1B is a compact, efficient large language model with 1.1 billion parameters, trained on 3 trillion tokens, suitable for applications with limited computational resources.
microsoft/phi-3: Phi-3 is a family of open AI models developed by Microsoft. Phi-3 models are the most capable and cost-effective small language models (SLMs) available, outperforming models of the same size and the next size up across a variety of language, reasoning, coding, and math benchmarks.
mistralai/Mistral-7B-Instruct-v0.2: The Mistral-7B-Instruct-v0.2 Large Language Model (LLM) is an improved instruct fine-tuned version of Mistral-7B-Instruct-v0.1.
meta-llama/Meta-Llama-3-8B-Instruct: Meta released the Llama 3 instruction-tuned models, which are optimized for dialogue use cases. They outperform many of the available open-source chat models on common industry benchmarks.
Detailed Pricing:
Model Name | Input Token Price (per 1K tokens) | Output Token Price (per 1K tokens) |
---|---|---|
TinyLlama/TinyLlama-1.1B-Chat-v1.0 | $0.00002818 | $0.00028184 |
microsoft/phi-3 | $0.00003483 | $0.00034834 |
mistralai/Mistral-7B-Instruct-v0.2 | $0.00004559 | $0.00045591 |
meta-llama/Meta-Llama-3-8B-Instruct | $0.00004559 | $0.00045591 |
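With token-based pricing, the cost of a request is simply (input tokens / 1,000) × input price plus (output tokens / 1,000) × output price. A minimal sketch of a cost estimator using the beta prices from the table above:

```python
# Per-1K-token beta prices (input, output) from the pricing table above.
PRICING = {
    "TinyLlama/TinyLlama-1.1B-Chat-v1.0": (0.00002818, 0.00028184),
    "microsoft/phi-3": (0.00003483, 0.00034834),
    "mistralai/Mistral-7B-Instruct-v0.2": (0.00004559, 0.00045591),
    "meta-llama/Meta-Llama-3-8B-Instruct": (0.00004559, 0.00045591),
}

def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of a single request in USD."""
    in_price, out_price = PRICING[model]
    return (input_tokens / 1000) * in_price + (output_tokens / 1000) * out_price

# Example: 2,000 input tokens and 500 output tokens on Llama 3 8B.
cost = request_cost("meta-llama/Meta-Llama-3-8B-Instruct", 2000, 500)
```

For this example the estimate comes to roughly $0.00032, which illustrates how output tokens (priced ten times higher than input tokens) dominate the bill.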
Why Choose MonsterAPI?
Diversity of Options: With models ranging from compact, task-focused options to general-purpose ones, our service ensures you have the right tool for every job.
Transparent Pricing: Our clear, token-based pricing model means you only pay for what you use, with no hidden fees or charges.
Ease of Use: Accessing our models is straightforward, with comprehensive documentation and support to help you integrate our API seamlessly into your applications.
Open-Source Models: We believe in the power of open-source technology. All the models listed in our LLM Inference service are hosted on our scalable, security-compliant, and affordable GPU cloud, ensuring low-cost, scalable access to open-source models.
For more information, visit our website or reach out to our support team. Let MonsterAPI empower your applications with the latest in LLM technology.
Beta throttle limits for different MonsterAPI plans:
Since the service is in beta, throttle limits are reduced.
Plan | Requests per 60 seconds | Daily Limit of API Calls |
---|---|---|
Free | 10 | 14,400 |
Wolf | 20 | 28,800 |
Beast | 40 | 57,600 |
Monster | 60 | 86,400 |
Get Started:
Sign up on MonsterAPI and get free trial credits.
Send requests to the /v1/generate endpoint with your desired payload and model to get text generation outputs.
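A sketch of assembling such a request in Python. Note that the base URL, the bearer-token auth scheme, and the payload field names (`model`, `prompt`, `max_tokens`) are illustrative assumptions here; check the MonsterAPI documentation for the exact schema:

```python
import json

def build_generate_request(base_url: str, api_key: str,
                           model: str, prompt: str) -> tuple[str, dict, bytes]:
    """Assemble the URL, headers, and JSON body for a /v1/generate call.

    The payload field names below are assumptions for illustration;
    consult the official API reference for the real schema.
    """
    url = f"{base_url}/v1/generate"
    headers = {
        "Authorization": f"Bearer {api_key}",  # assumed bearer-token auth
        "Content-Type": "application/json",
    }
    payload = {"model": model, "prompt": prompt, "max_tokens": 256}
    return url, headers, json.dumps(payload).encode()

url, headers, body = build_generate_request(
    "https://llm.monsterapi.ai",  # placeholder; use the base URL from your dashboard
    "YOUR_API_KEY",
    "meta-llama/Meta-Llama-3-8B-Instruct",
    "Write a haiku about GPUs.",
)
# Send with any HTTP client, e.g.:
#   req = urllib.request.Request(url, data=body, headers=headers)
#   resp = urllib.request.urlopen(req)
```

Keeping request assembly separate from the HTTP call makes it easy to swap in `requests`, `httpx`, or an async client later.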
Support:
For any query, complaint, or suggestion, please contact us at support@monsterapi.ai.
- Text Generation LLM APIs - Generate: POST {{baseUrl}}/v1/generate
- Text Generation LLM APIs - Get Models Pricing: GET {{baseUrl}}/v1/models/pricing