IBM Granite

2023 text-generating language model
  • Multimodal
  • Large language model
  • Generative pre-trained transformer
  • Foundation model
License: Proprietary
Code models: Open Source (Apache 2.0)[2]

IBM Granite is a series of decoder-only foundation models created by IBM. It was announced on September 7, 2023,[3][4] and an initial paper was published four days later.[5] The models were initially intended for use in Watsonx, IBM's cloud-based data and generative AI platform, alongside other models;[6] IBM later opened the source code of some of the code models.[7] Granite models are trained on datasets curated from the Internet, academic publications, code datasets, and legal and finance documents.[8][9][1]

Foundation models

A foundation model is an AI model trained on broad data at scale such that it can be adapted to a wide range of downstream tasks.[10]

Granite's first foundation models were Granite.13b.instruct and Granite.13b.chat. The "13b" in their names refers to their 13 billion parameters, fewer than most of the larger models of the time. Later models range from 3 to 34 billion parameters.[3][11]
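
The naming convention described above can be illustrated with a minimal Python sketch. The helper `parameter_count` is hypothetical and is not part of any IBM tooling; it assumes only that a model name contains a token such as "13b", read as billions of parameters.

```python
import re


def parameter_count(model_name: str) -> int:
    """Return the parameter count encoded in a name such as 'Granite.13b.instruct'.

    Hypothetical helper for illustration: it assumes the size appears as a
    number followed by 'b' (billions), as in the names quoted in this article.
    """
    match = re.search(r"(\d+(?:\.\d+)?)b\b", model_name, flags=re.IGNORECASE)
    if match is None:
        raise ValueError(f"no parameter-count token in {model_name!r}")
    # Convert the number of billions into an absolute parameter count.
    return int(float(match.group(1)) * 1_000_000_000)


if __name__ == "__main__":
    for name in ("Granite.13b.instruct", "Granite.13b.chat", "Granite 8b"):
        print(name, parameter_count(name))
```

Under this assumption, both of the first Granite models resolve to the same count of 13 billion parameters; a name without a size token raises a `ValueError` rather than guessing.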

On May 6, 2024, IBM released the source code of four variations of the Granite Code Models and made them available on Hugging Face for public use.[12] According to IBM's own report, Granite 8b outperforms Llama 3 on several coding-related tasks among models with a similar number of parameters,[13] whereas the README page on Hugging Face for the Granite 7b base model states that it barely outperforms Llama 2 7b.[14]

References

  1. ^ a b McDowell, Steve. "IBM's New Granite Foundation Models Enable Safe Enterprise AI". Forbes.
  2. ^ ibm-granite/granite-code-models, IBM Granite, 2024-05-08, retrieved 2024-05-08
  3. ^ a b Nirmal, Dinesh (September 7, 2023). "Building AI for business: IBM's Granite foundation models". IBM.
  4. ^ "IBM debuts Granite series of hardware-efficient language models". September 7, 2023.
  5. ^ "Granite Foundation Models" (PDF). IBM. 2023-11-30.
  6. ^ Fritts, Harold (2024-04-22). "IBM Adds Meta Llama 3 To watsonx, Expands AI Offerings". StorageReview.com. Retrieved 2024-05-08.
  7. ^ Jindal, Siddharth (2024-05-07). "IBM Releases Open-Source Granite Code Models, Outperforms Llama 3". Analytics India Magazine. Retrieved 2024-05-08.
  8. ^ Azhar, Ali (2024-04-08). "IBM Patents a Faster Method to Train LLMs for Enterprises". Datanami. Retrieved 2024-05-08.
  9. ^ Wiggers, Kyle (2023-09-07). "IBM rolls out new generative AI features and models". TechCrunch. Retrieved 2024-05-08.
  10. ^ "Introducing the Center for Research on Foundation Models (CRFM)". Stanford HAI. 18 August 2021.
  11. ^ Pawar, Sahil (2023-09-11). "IBM Introduces Granite Series LLM Models for Watsonx Platform". Analytics Drift. Retrieved 2024-05-09.
  12. ^ Nine, Adrianna (May 7, 2024). "IBM Makes Granite AI Models Open-Source Under New InstructLab Platform". ExtremeTech.
  13. ^ Jindal, Siddharth (2024-05-07). "IBM Releases Open-Source Granite Code Models, Outperforms Llama 3". Analytics India Magazine. Retrieved 2024-05-09.
  14. ^ "README.md · ibm/granite-7b-base at main". huggingface.co. 2024-04-19. Retrieved 2024-05-09.

External links

  • GitHub page
  • Hugging Face page