Re: Is the Gemini model the new king?

L_Skorwider · ‎12-07-2023

Hello Group,

I'm curious if you've seen the presentation of Gemini - the new multi-modal model from Google. It looks very promising, doesn't it? I think it has a chance to be in the spotlight for a few weeks. In my opinion, this is a big step for a company that was a little behind in this race.

It's very impressive to me, especially the speed at which AI technology is developing. Where will it take us? What is your strategy as experts, specialists, or maybe just juniors in the SAP field? How do you see your profession in a year, or 10 years from now? How will you steer your career in a changing world?

And yes, it's just a demo, and the reality won't be that great. But all indications are that GPT4 has been beaten.

I am curious about your thoughts.

L_Skorwider · ‎12-14-2023

What a shame. What's going on in this AI world lately? It started out so high-minded, and turned out as usual.

Google has been called out over this Gemini demo and has even admitted that it is not quite real. It looks like it's just an edited video showing what AI might be like if it could really react in real time. But it can't. Some have called it a slight misrepresentation, while others have used even stronger words. Much stronger.

Instead, GPT4 has reached the next level of development! It can finally be described as almost human intelligence, well almost. This is a great step forward for all of humanity. By the developers' own admission, GPT4 has become lazy! How human it is!

DiegoDora · ‎12-16-2023

Hi @L_Skorwider !

I don't think there is enough proof yet to assume Gemini is a better model than GPT4V... BUT I do think that the future of foundation models should be multi-modal, or at least contemplate these types of capabilities.

Personally, I didn't feel defrauded or misled by Gemini's video, because I know how these models work and real-time video processing is just not there yet... or it's extremely expensive.

The laziness of GPT4 sounds like someone has been playing with the weights in production. It will be interesting to keep following the field and see what Anthropic, Facebook and others have to offer.

Thanks for sharing!

Diego.

L_Skorwider · ‎12-16-2023

Hi Diego,

Yes, the Gemini demos presented what multi-modality should look like. I agree that this is the unavoidable future. And even if there are still very specialized single-modal models, aimed at some very specific target, in my opinion all the rest will be multi-modal. Or maybe someone will make such progress that all the rest will no longer be needed? I'm keeping my fingers crossed that no one will monopolize this field.

But for the time being, I am very positively surprised by what can be done going in the opposite direction. Recently, Mixtral was presented - a model that is a combination of eight small LLMs. Each of the component models has no more than 7 billion parameters. Two are selected for each response. Its predecessor, Mistral 7B, was already very interesting given its size, but Mixtral is seriously great. And thanks to the small models, it's very fast. Yes, it's rather a dead end and the opposite of multi-modality, but still very interesting.
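To make the "two are selected" idea concrete, here is a minimal sketch of top-2 expert routing in plain Python. It is a toy illustration under stated assumptions, not Mixtral's actual code: the experts, the router, and all function names are hypothetical, and in mixture-of-experts models of this kind the selection is typically made per token inside each layer rather than once per whole response.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def top2_route(router_logits):
    """Pick the two highest-scoring experts and normalize their weights.

    Returns a list of (expert_index, weight) pairs; the weights sum to 1.
    """
    order = sorted(range(len(router_logits)),
                   key=lambda i: router_logits[i], reverse=True)
    top2 = order[:2]
    weights = softmax([router_logits[i] for i in top2])
    return list(zip(top2, weights))

def moe_layer(x, experts, router):
    """Run only the two selected experts and mix their outputs."""
    out = [0.0] * len(x)
    for idx, weight in top2_route(router(x)):
        y = experts[idx](x)  # the other six experts are never evaluated
        out = [o + weight * yi for o, yi in zip(out, y)]
    return out

# Hypothetical toy setup: expert k multiplies its input by (k + 1),
# and the router returns fixed scores that favor experts 1 and 3.
experts = [lambda v, k=k: [(k + 1) * vi for vi in v] for k in range(8)]
router = lambda v: [0.0, 3.0, 1.0, 2.0, -1.0, 0.5, 0.2, 0.1]

routes = top2_route(router([1.0, 2.0]))
result = moe_layer([1.0, 2.0], experts, router)
```

Only two of the eight experts do any work for a given input, which is why such a model can run much faster than its total parameter count suggests.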

Back to Gemini. Well, we more or less knew it didn't really work like that, but a bad taste remained. One, any creativity in the presentation should be clearly communicated. Two, not everyone knows what to expect. Three, you never know what progress was made overnight. Meanwhile, the whole situation is not improved by the fact that Gemini Ultra, on which the demo was based, is not available. And there's no guarantee when it will be. By then we may already have a new king. And looking at the speed of development, we probably will.

Yes, the message about laziness just made me laugh. Hence the slightly sarcastic tone, amid all the seriousness surrounding the respectable topic of AI.

Thanks for the reply. 🙂

sjochen · ‎12-21-2023

It's hard to judge in a general way which model is better. The last couple of months have shown that the performance and quality of a model depend heavily on the kind of tasks it is used for.

I'm pretty sure that in the medium term it will be hard for general-purpose or world models to truly differentiate themselves through outcome quality.

The interesting question still is what will then make the difference?

L_Skorwider · ‎12-29-2023

I think that we can assume that the quality is much more important than performance in the long run. Especially for business applications.

Well, it's possible to measure which model offers better quality. There are many metrics that allow us to evaluate and compare language models along different dimensions. But in my opinion, not all of them are equally important for business applications. If I had to pick two aspects of LLM operation that can be critical in business applications, I would say they are logical reasoning and mathematical problem-solving skills.

But of course, this mainly depends on the type of business and the specific case. In many specific implementations, these features can be completely secondary.

On the other hand, any prediction of a winner based on the status quo with such dynamic development of the field is just a blind guess. Technological advantage is one thing, while knowledge sources for training models will play an increasing role. The NYT's lawsuit against OpenAI and Microsoft clearly illustrates this.