Omega Healthcare - Model Card

Model Overview

Name: Omega Healthcare Model
Type: Auto-regressive language model
Last Updated: 10 August 2024

Model Details

Architecture: Autoregressive Transformer with expert routing (mixture of experts); a routing sketch follows below
Training Data: 5.3 trillion tokens from public web pages, books, and code
Multilingual Capability:
  • Understands and generates text in 107+ languages: 27 human languages and 80 programming languages
  • Multilingual data accounts for 7% of the total training data
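
The card names expert routing but does not disclose the scheme. As a minimal sketch, assuming a standard top-k softmax router (the expert count, sizes, and all names below are illustrative, not Omega's actual configuration):

```python
# Minimal sketch of top-k expert routing in a Transformer MLP block.
# All names and sizes are illustrative; the card does not disclose
# the actual routing scheme, expert count, or hidden size.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKRouter(nn.Module):
    def __init__(self, d_model: int, n_experts: int, k: int = 2):
        super().__init__()
        self.gate = nn.Linear(d_model, n_experts, bias=False)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, 4 * d_model),
                          nn.GELU(),
                          nn.Linear(4 * d_model, d_model))
            for _ in range(n_experts)
        )
        self.k = k

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (tokens, d_model). Each token is sent to its top-k experts,
        # whose outputs are combined with renormalized gate weights.
        logits = self.gate(x)                       # (tokens, n_experts)
        weights, idx = logits.topk(self.k, dim=-1)  # top-k experts per token
        weights = F.softmax(weights, dim=-1)
        out = torch.zeros_like(x)
        for slot in range(self.k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e
                if mask.any():
                    out[mask] += weights[mask, slot, None] * expert(x[mask])
        return out
```

Because only k experts run per token, per-token compute stays well below that of a dense model with the same total parameter count, which is the cost saving the note at the end of this card refers to.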

Training Process

Foundation Training: Two-stage approach
  • Stage 1: Mixed dataset (web, code, math, general)
  • Stage 2: Mixed data plus textbook-style data (inspired by Microsoft's "Textbooks Are All You Need")
Post-Training: Synthetic dataset categorized into multiple domains
Human Alignment: Direct Preference Optimization (DPO) on 100% synthetic data (sketched below)
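
DPO trains the policy directly on preference pairs, with no separate reward model. A minimal sketch of the objective (Rafailov et al., 2023), assuming summed per-response token log-probabilities and an illustrative β:

```python
# Minimal sketch of the DPO loss; tensor names and beta are
# illustrative, not Omega's actual alignment setup.
import torch
import torch.nn.functional as F

def dpo_loss(policy_logp_chosen: torch.Tensor,
             policy_logp_rejected: torch.Tensor,
             ref_logp_chosen: torch.Tensor,
             ref_logp_rejected: torch.Tensor,
             beta: float = 0.1) -> torch.Tensor:
    # Log-ratios measure how much more the policy favors each response
    # than the frozen reference model does.
    chosen_ratio = policy_logp_chosen - ref_logp_chosen
    rejected_ratio = policy_logp_rejected - ref_logp_rejected
    # -log sigmoid of the scaled margin pushes the policy toward the
    # chosen response and away from the rejected one.
    return -F.logsigmoid(beta * (chosen_ratio - rejected_ratio)).mean()
```

The reference model is typically a frozen copy of the post-trained checkpoint, and β controls how far the policy may drift from it.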

Training Data Distribution

Post-Training Language Distribution

English: 40%
Multilingual: 60%

Domain-Specific Training Data

Math: 17.35%
Multiple Choice: 5.28%
General: 11.46%
Creative Writing: 9.80%
Role Play: 1.23%
Code: 24.27%
Task: 8.37%
Dialog: 1.31%
Context-based Q&A: 15.56%
Official Writing: 1.21%
Translation: 4.01%
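
As one illustration of how this mix could be realized during post-training, a weighted sampler that draws each example's domain in proportion to the shares above (the mechanism and names are an assumption; only the percentages come from this card):

```python
# Illustrative weighted sampling over the post-training domains.
# Percentages are from this card; the sampler itself is an assumption.
import random

DOMAIN_WEIGHTS = {
    "code": 24.27, "math": 17.35, "context_qa": 15.56,
    "general": 11.46, "creative_writing": 9.80, "task": 8.37,
    "multiple_choice": 5.28, "translation": 4.01, "dialog": 1.31,
    "role_play": 1.23, "official_writing": 1.21,
}

def sample_domain(rng: random.Random) -> str:
    """Draw the domain of the next example in proportion to its share."""
    domains, weights = zip(*DOMAIN_WEIGHTS.items())
    return rng.choices(domains, weights=weights, k=1)[0]

rng = random.Random(0)
batch_domains = [sample_domain(rng) for _ in range(8)]
```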

Evaluation Metrics

MMLU (5-shot): 82.3
HumanEval (code): 85.4
MBPP (code): 79.2
TruthfulQA: 59.3
GSM8K (grade-school math): 89.7
MATH (problem solving): 57.8

Key Features

Intended Healthcare Use Cases

Usage Considerations

Ethical Framework

Fairness and Representation

Safety Protocols

Bias Mitigation Strategy

Recommendations for Use

Future Work

Citations

  • Gunasekar, S., et al. (2023). "Textbooks Are All You Need." arXiv:2306.11644.
  • Rafailov, R., et al. (2023). "Direct Preference Optimization: Your Language Model Is Secretly a Reward Model." arXiv:2305.18290.

Note: Metrics were obtained using our mixture-of-experts architecture, which maintains quality while reducing inference cost. Benchmark results may vary with expert routing, evaluation strategy, and method.