Omega Healthcare - Model Card
Model Overview
| Field | Value |
| --- | --- |
| Name | Omega Healthcare Model |
| Type | Auto-regressive language model |
| Last Updated | 10 August 2024 |
Model Details
| Field | Value |
| --- | --- |
| Architecture | Autoregressive Transformer architecture with expert routing capabilities |
| Training Data | 5.3 trillion tokens from public web, books, and code |
| Multilingual Capability | Understands and generates output in 107+ languages, comprising 27 human languages and 80 programming languages; multilingual data accounts for 7% of total training data |
Training Process
| Stage | Description |
| --- | --- |
| Foundation Training | Two-stage approach. Stage 1: mixed dataset (web, code, math, general). Stage 2: mixed data plus textbook-style data (inspired by Microsoft's "Textbooks Are All You Need"). |
| Post-Training | Synthetic dataset categorized into multiple domains |
| Human Alignment | Direct Preference Optimization (DPO) using 100% synthetic preference data; a loss sketch follows this table |
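To make the alignment step concrete, below is a minimal sketch of the DPO objective from Rafailov et al. (2023), cited in the references. The function name and the assumption that per-response log-probabilities are precomputed are ours for illustration; this is not Omega's production training code.

```python
# Minimal sketch of the DPO loss (Rafailov et al., 2023). Assumes the
# summed log-probabilities of each chosen/rejected response under the
# policy and a frozen reference model are already computed.
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Each argument: shape (batch,) of per-response log-probs."""
    chosen_ratio = policy_chosen_logp - ref_chosen_logp
    rejected_ratio = policy_rejected_logp - ref_rejected_logp
    # Maximize the margin between preferred and dispreferred responses.
    return -F.logsigmoid(beta * (chosen_ratio - rejected_ratio)).mean()
```

The appeal of DPO is that it needs no separately trained reward model: the policy's log-probability margin over the frozen reference acts as the implicit reward.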
Training Data Distribution
Post-Training Language Distribution
| Language | Share |
| --- | --- |
| English | 40% |
| Multilingual | 60% |
Domain-Specific Training Data
| Domain | Share |
| --- | --- |
| Math | 17.35% |
| Multiple Choice | 5.28% |
| General | 11.46% |
| Creative Writing | 9.80% |
| Role Play | 1.23% |
| Code | 24.27% |
| Task | 8.37% |
| Dialog | 1.31% |
| Context-based Q&A | 15.56% |
| Official Writing | 1.21% |
| Translation | 4.01% |
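As a minimal sketch of how a mixture like this can drive sampling in a data loader, the snippet below draws domains in proportion to the table's weights. `random.choices` is a simple stand-in; a production pipeline would stream shards per domain.

```python
# Sample post-training examples according to the domain weights above.
import random

DOMAIN_WEIGHTS = {
    "code": 24.27, "math": 17.35, "context_qa": 15.56, "general": 11.46,
    "creative_writing": 9.80, "task": 8.37, "multiple_choice": 5.28,
    "translation": 4.01, "dialog": 1.31, "role_play": 1.23,
    "official_writing": 1.21,
}

def sample_domains(n: int) -> list[str]:
    """Draw n domain labels in proportion to their training share."""
    domains = list(DOMAIN_WEIGHTS)
    weights = list(DOMAIN_WEIGHTS.values())
    return random.choices(domains, weights=weights, k=n)
```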
Evaluation Metrics
| Benchmark | Score |
| --- | --- |
| MMLU (5-shot) | 82.3 |
| HumanEval (code) | 85.4 |
| MBPP (code) | 79.2 |
| TruthfulQA | 59.3 |
| GSM8K (grade-school math) | 89.7 |
| MATH (problem solving) | 57.8 |
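For context on the "5-shot" setting above, the sketch below shows one common way such a prompt is assembled: five worked exemplars precede the test question. The helper and format are illustrative assumptions, not the exact evaluation harness behind these numbers.

```python
# Assemble a 5-shot prompt in the style used for MMLU-type benchmarks.
def build_five_shot_prompt(exemplars: list[tuple[str, str]], question: str) -> str:
    """exemplars: list of (question, answer) pairs; len == 5 for 5-shot."""
    parts = [f"Q: {q}\nA: {a}" for q, a in exemplars]
    parts.append(f"Q: {question}\nA:")  # the model completes this answer
    return "\n\n".join(parts)
```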
Key Features
- Expert routing architecture, inspired by the way human specialists divide work and by Google's Mixture-of-Experts research (see the routing sketch after this list)
- Token-level routing of queries to multiple expert models
- Strong performance in reasoning, coding, and mathematical tasks
- Iterative learning capabilities for new tasks, styles, and languages
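As a concrete illustration of token-level expert routing, here is a minimal PyTorch sketch of a top-k routed MoE layer in the spirit of the sparsely-gated Mixture-of-Experts cited in the references. Layer sizes, the top-2 choice, and class names are illustrative assumptions, not Omega's actual architecture.

```python
# Minimal sketch of token-level top-k expert routing. Illustrative only.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKMoE(nn.Module):
    def __init__(self, d_model=512, d_ff=2048, n_experts=8, k=2):
        super().__init__()
        self.k = k
        self.gate = nn.Linear(d_model, n_experts, bias=False)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(),
                          nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x):  # x: (tokens, d_model)
        logits = self.gate(x)                       # (tokens, n_experts)
        weights, idx = logits.topk(self.k, dim=-1)  # each token picks k experts
        weights = F.softmax(weights, dim=-1)        # normalize over selected experts
        out = torch.zeros_like(x)
        for slot in range(self.k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e            # tokens whose slot-th pick is expert e
                if mask.any():
                    out[mask] += weights[mask, slot:slot + 1] * expert(x[mask])
        return out
```

Only the k selected experts run per token, which is how routed architectures can hold quality while cutting inference compute.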
Intended Healthcare Use Cases
- Analyzing X-rays to detect abnormalities like fractures or tumors.
- Interpreting MRI images for early signs of neurological disorders.
- Generating radiology reports from CT scan data automatically.
- Enhancing ultrasound image clarity using deep learning techniques.
- Providing real-time feedback during interventional radiology procedures.
- Assisting in the classification of mammography results to identify cancer.
- Automating the extraction of quantitative data from PET scans.
- Automating follow-up recommendations based on radiological findings.
- Predicting disease progression using historical radiological imaging data.
- Identifying and classifying lung nodules in chest CT scans.
- Facilitating remote radiology consultations through image-based AI diagnostics.
- Enhancing educational tools for radiology students with interactive models.
- Streamlining the prioritization of urgent cases in radiological workflows.
- Comparing current images with previous scans to track changes.
- Providing second opinions on radiographic interpretations via AI analysis.
- Identifying rare conditions through pattern recognition in radiographic data.
- Optimizing radiation doses based on AI-driven patient modeling.
- Using AI to detect subtle changes in bone density over time.
- Automating the detection of vascular abnormalities in angiography images.
- Developing personalized imaging protocols based on AI predictions.
- Monitoring the effectiveness of ongoing treatments through imaging analysis.
Usage Considerations
- Performance optimized for well-represented languages and domains; may require additional fine-tuning for specialized applications
- Outputs should be reviewed for accuracy and potential biases, especially in sensitive contexts
- Knowledge cutoff date: December 2023; supplement with up-to-date information for time-sensitive tasks
- Designed for augmenting human decision-making rather than autonomous critical operations
- Best suited for tasks within its training scope; may require expert input for highly specialized knowledge domains
Ethical Framework
- Implemented with safeguards to minimize misleading or biased content generation
- Users are encouraged to employ the model responsibly and avoid potential misuse
- Developed with consideration for environmental impact; ongoing efforts to optimize efficiency
Fairness and Representation
- Trained on diverse data to enhance performance across demographic groups and languages
- Users should be aware of potential variations in performance across different contexts
- Continuous improvement process in place to enhance fairness and representation
Safety Protocols
- Designed for human-in-the-loop applications, enhancing rather than replacing human oversight
- Incorporates best practices for secure code generation; additional security reviews recommended for critical applications
- Robust architecture to mitigate vulnerabilities, with ongoing security enhancements
Bias Mitigation Strategy
- We have implemented guardrails during reinforcement learning to reduce the likelihood of harmful outputs, privacy leaks, and other potential harms
- Our approach includes continuous monitoring and iterative improvements to enhance the model's safety and reliability
- Users are encouraged to provide feedback to help identify and address any remaining biases or issues
Recommendations for Use
- Implement content filtering and safety checks on model outputs (see the sketch after this list)
- Provide clear disclaimers to end-users about the model's limitations and potential biases
- Ensure human oversight for critical applications
- Regularly update and fine-tune the model with diverse, high-quality data
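To make the first recommendation concrete, the sketch below wraps a hypothetical `generate()` callable with a blocklist check and an appended disclaimer. The blocklist phrases and function names are our illustrative assumptions; a production system would use a trained safety classifier and route flagged outputs to human review.

```python
# Minimal sketch of a content-filtering wrapper around model outputs.
from typing import Callable

# Hypothetical phrases that should never appear in unreviewed clinical output.
BLOCKLIST = {"definitive diagnosis", "stop taking", "no need to see a doctor"}

def safe_generate(prompt: str, generate: Callable[[str], str]) -> str:
    """Run the model, block flagged outputs, and append a disclaimer."""
    text = generate(prompt)
    lowered = text.lower()
    if any(phrase in lowered for phrase in BLOCKLIST):
        return "Output withheld pending human review."
    return text + "\n\nDisclaimer: AI-generated; verify with a clinician."
```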
Future Work
- Continuous improvement through iterative Reinforcement Learning on diverse, high-quality data
- Further development of expert models for specific domains
- Increasing multilingual and synthetic data in pretraining datasets
- Ongoing bias detection and mitigation efforts
- Research into more energy-efficient training and deployment methods
Citations
- Shazeer, N., Mirhoseini, A., Maziarz, K., Davis, A., Le, Q., Hinton, G., & Dean, J. (2017). Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer. arXiv.
- Kaplan, J., McCandlish, S., Henighan, T., Brown, T. B., Chess, B., Child, R., Gray, S., Radford, A., Wu, J., & Amodei, D. (2020). Scaling Laws for Neural Language Models. arXiv.
- Wei, J., Bosma, M., Zhao, V. Y., Guu, K., Yu, A. W., Lester, B., Du, N., Dai, A. M., & Le, Q. V. (2021). Finetuned Language Models Are Zero-Shot Learners. arXiv.
- Huang, Z., Liang, M., Qin, J., Zhong, S., & Lin, L. (2023). Understanding self-attention mechanism via dynamical system perspective.
- Fukui, H., Hirakawa, T., Yamashita, T., & Fujiyoshi, H. (2019). Attention Branch Network: Learning of Attention Mechanism for Visual Explanation.
- Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, L., & Polosukhin, I. (2017). Attention Is All You Need. arXiv.
- Alammar, J. (n.d.). The Illustrated Transformer.
- Brown, T. B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., Herbert-Voss, A., Krueger, G., Henighan, T., Child, R., Ramesh, A., Ziegler, D. M., Wu, J., Winter, C., . . . Amodei, D. (2020). Language Models are Few-Shot Learners. arXiv.
- Quantifying multilingual performance of large language models across languages. (n.d.).
- Levy, S., John, N., Liu, L., Vyas, Y., Ma, J., Fujinuma, Y., Ballesteros, M., Castelli, V., & Roth, D. (2023). Comparing Biases and the Impact of Multilingual Training across Multiple Languages. ACL Anthology.
- Liu, J., Wang, B., Shen, X., Qi, Z., & Tian, Y. (2021). Two-stage Training for Learning from Label Proportions. arXiv.
- Gunasekar, S., Zhang, Y., Aneja, J., Mendes, C. C. T., Del Giorno, A., Gopi, S., Javaheripi, M., Kauffmann, P., De Rosa, G., Saarikivi, O., Salim, A., Shah, S., Behl, H. S., Wang, X., Bubeck, S., Eldan, R., Kalai, A. T., Lee, Y. T., & Li, Y. (2023). Textbooks Are All You Need. arXiv.
- Hu, S., Zhou, H., Hergul, M., Gritta, M., Zhang, G., Iacobacci, I., Vulić, I., & Korhonen, A. (2023). Multi3WOZ: A Multilingual, Multi-Domain, Multi-Parallel Dataset for Training and Evaluating Culturally Adapted Task-Oriented Dialog Systems. Transactions of the Association for Computational Linguistics, 11, 1396–1415.
- Rafailov, R., Sharma, A., Mitchell, E., Ermon, S., Manning, C. D., & Finn, C. (2023). Direct Preference Optimization: Your Language Model is Secretly a Reward Model. arXiv.
- Mixture-of-Depths: Dynamically allocating compute in transformer-based language models. (n.d.).
- Sottana, A., Liang, B., Zou, K., & Yuan, Z. (2023). Evaluation Metrics in the Era of GPT-4: Reliably Evaluating Large Language Models on Sequence to Sequence Tasks. arXiv.
- Reddy, S., Chen, D., & Manning, C. D. (2019). CoQA: A Conversational Question Answering Challenge. Transactions of the Association for Computational Linguistics, 7, 249–266.
- Bibas, K., Shalom, O. S., & Jannach, D. (2022). Collaborative Image Understanding. arXiv.
- Goldberg, Y., & Orwant, J. (2013). A Dataset of Syntactic-Ngrams over Time from a Very Large Corpus of English Books.
- ATISS: Autoregressive Transformers for Indoor Scene Synthesis. (n.d.). NeurIPS.
- Montesinos, D. M. (2020). Modern Methods for Text Generation. arXiv.
- Liu, T., Feng, F., & Wang, X. (2021). Multi-stage Pre-training over Simplified Multimodal Pre-training Models. arXiv.
- Schumi, R., & Sun, J. (2022). ExAIS: Executable AI Semantics. Proceedings of the 44th International Conference on Software Engineering.
Note: Metrics were obtained using our multiple-expert architecture, which maintains quality while reducing cost. Benchmark results may vary with expert routing, evaluation strategy, and methodology.