Cohere's New AI Models: Bridging the Language Gap
San Francisco, USA, Saturday, October 26, 2024
Cohere's approach to avoiding gibberish in AI-generated text is notable. The company uses data arbitrage to reduce reliance on synthetic data, which can be problematic for low-resource languages. It has also worked out how to guide models toward global preferences and account for cultural and linguistic diversity, which improves both performance and safety.
The main challenge in building non-English language models is finding sufficient training data. English dominates government, finance, and business, so English-language data is easier to gather. Benchmarking models across languages is also tricky because the quality of translated test material varies. Other developers have released datasets to support this research, such as OpenAI's dataset covering 14 languages.
Cohere has been active lately, adding image search to Embed 3 and enhancing fine-tuning for its Command R model.