I tried Google Gemini Pro on Bard and this is how it behaved

Sweta Gupta | Dec 07, 2023 | 14:31

You can test Bard with Gemini Pro right now for text-based questions. It is available in English in 170 countries. (Photo Credits: Twitter/Google)

Google has unveiled Gemini, its latest and most potent multimodal general AI model.

This advanced AI is now accessible worldwide through platforms like Bard, specific developer platforms, and the newly-released Google Pixel 8 Pro devices.

I experimented with Gemini Pro, available on Bard, exploring its content summarisation, image recognition, and voice-to-text interpretation features.

Seeing some qs on what Gemini *is* (beyond the zodiac :). Best way to understand Gemini’s underlying amazing capabilities is to see them in action, take a look ⬇️ pic.twitter.com/OiCZSsOnCc
— Sundar Pichai (@sundarpichai) December 6, 2023

What is Gemini?

Gemini is a large language model (LLM) developed by Google DeepMind to compete with, and potentially surpass, other AI systems such as OpenAI's ChatGPT.

Key features of Gemini

Gemini can process several types of information, including text, code, audio, images, and video.
This enables it to hold conversations and recognise video content in real time.
It belongs to Google's new generation of models built on Pathways, Google's AI infrastructure, and is categorised as one of the "next-generation multimodal models." This suggests it may be among the largest language models developed so far.
Gemini is available in different versions, each with its own strengths.
Some versions might use memory, fact-check answers via Google Search, and keep learning to improve accuracy and safety over time.

We’re excited to announce 𝗚𝗲𝗺𝗶𝗻𝗶: @Google’s largest and most capable AI model.

Built to be natively multimodal, it can understand and operate across text, code, audio, image and video - and achieves state-of-the-art performance across many tasks. 🧵 https://t.co/mwHZTDTBuG pic.twitter.com/zfLlCGuzmV
— Google DeepMind (@GoogleDeepMind) December 6, 2023

I tested the text, image, and voice response features of Bard, which now runs on Gemini Pro.

Here's how it responded in each mode:

When prompted to summarise 'David Copperfield,' Bard provided a structured breakdown by chapters along with pertinent links to related topics.

The voice-to-text feature operates effectively. When a spoken question is recorded, it accurately appears as text in the search bar, promptly generating the desired search results.

When testing the image identification feature, it successfully recognized generic images. However, it faced challenges in identifying individuals depicted in the provided images.

When prompted to identify a picture of Prime Minister Narendra Modi, it responded with, 'I can't help with images of people yet,' indicating its inability to recognise individuals.

Even when asked to identify landmarks such as India Gate and the Colosseum in Rome, the AI could not name the places accurately.

Google plans to enhance and extend these features and make them available across all devices.

Three sizes of Gemini

Gemini comes in three sizes to match different needs.

Gemini Ultra, the biggest and most powerful model, is designed for highly complex tasks.
Right now, it is only available to selected customers, developers, partners, and safety experts for early testing.
It will be released to developers and business customers in early 2024.

Introducing Gemini 1.0, our most capable and general AI model yet. Built natively to be multimodal, it’s the first step in our Gemini-era of models. Gemini is optimized in three sizes - Ultra, Pro, and Nano

Gemini Ultra’s performance exceeds current state-of-the-art results on… pic.twitter.com/pzIw6iCPPN
— Sundar Pichai (@sundarpichai) December 6, 2023

Gemini Pro is great for handling a wide variety of tasks and is already accessible to regular users through Bard.
On Bard, there's a special version of Gemini Pro in English that's fine-tuned for advanced reasoning, planning, and understanding.
Developers and business customers can use Gemini Pro through the Gemini API in Google AI Studio or Google Cloud Vertex AI.
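
For developers curious what a call to the Gemini API might look like, here is a minimal sketch in Python. It only builds the HTTP request and does not send it. The endpoint URL, the model name `gemini-pro`, and the request and response shapes are assumptions based on the public API at launch, and `build_request` and `extract_text` are hypothetical helper names, so check Google's current documentation before relying on any of it.

```python
import json
import urllib.request

# Assumed endpoint and model name, based on the public Gemini API at launch;
# verify both against the current documentation.
API_BASE = "https://generativelanguage.googleapis.com/v1beta"
MODEL = "gemini-pro"


def build_request(prompt: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a generateContent request for a text prompt."""
    url = f"{API_BASE}/models/{MODEL}:generateContent?key={api_key}"
    body = {"contents": [{"parts": [{"text": prompt}]}]}
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def extract_text(response_json: dict) -> str:
    """Join the text parts of the first candidate in a decoded response."""
    parts = response_json["candidates"][0]["content"]["parts"]
    return "".join(part.get("text", "") for part in parts)


if __name__ == "__main__":
    req = build_request("Summarise 'David Copperfield' in three sentences.",
                        "YOUR_API_KEY")
    print(req.full_url)
    # To actually call the API, supply a real key and uncomment:
    # with urllib.request.urlopen(req) as resp:
    #     print(extract_text(json.load(resp)))
```

An API key from Google AI Studio would be needed to send the request. Vertex AI uses a different endpoint and authentication scheme, which this sketch does not cover.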

Now Gemini Pro is coming today in Bard’s biggest update yet (in English in 170 countries) with more advanced reasoning and understanding in the responses. Bard Advanced with Ultra, our most general and capable model for highly complex tasks, is coming early next year.… pic.twitter.com/x6W90HJMJw
— Sundar Pichai (@sundarpichai) December 6, 2023

Gemini Nano is built for on-device tasks and is already running on Pixel 8 Pro, bringing new features like Summarize in the Recorder app and Smart Reply in Gboard, starting with WhatsApp.
From December 13, Android developers can also use Gemini Nano through AICore, a new feature in Android 14, initially on Pixel 8 Pro devices.

Gemini Nano is super efficient for tasks that are on-device. Android developers can sign up for an early access program for Gemini Nano via Android AICore and Pixel 8 Pro users can already see it rolling out in features like Summarize in Recorder and Smart Reply in Gboard + much… pic.twitter.com/KFIei4D9Pc
— Sundar Pichai (@sundarpichai) December 6, 2023

Currently, Bard with Gemini Pro is accessible in English in more than 170 countries and territories. It is slated to expand to additional languages and regions, including Europe, in the near future.

Last updated: December 07, 2023 | 14:31
