This is an industry in which mere days are equivalent to months anywhere else. The pace of innovation and creativity in generative AI and its applications hits new milestones each day. Jobs are being created and jobs are being taken away as an outcome of this fresh wave of task automation that AI has brought us. The world being created is practically unrecognisable to anyone from even 20 years ago. And here we are, at the start of what looks to be an exponential growth curve for humanity and the intelligence it can deploy. Alternatively, it’s just more cloud servers full of GPUs instead of CPUs. Who knows. 🤷🏻‍♂️
To start off, I would recommend you check out these two conversations from Davos:
One of these is a conversation from the point of view of an investor and innovator in AI - Microsoft. The other is from the point of view of the man who has become the face of today’s AI boom - Sam Altman. I was particularly struck by one statement from Sam, which went:
“My whole model of the world is that the two important currencies of the future are compute/intelligence and energy, um you know the ideas that we want and the ability to make stuff happen.”
The ideas that we want and the ability to make stuff happen - seeing the needs of the world distilled in such clear terms is a definite eye-opener and helps one think long term. For views like these, the interviews shared above are great.
Also great - the annoyance of Sam Altman when asked about all the OpenAI drama. Delightful.
1. Chat Away…
Apart from the launch of the GPT Store - which in theory could become OpenAI’s largest monetization opportunity, driving repeat purchases and longer customer lifetime values - OpenAI also stands to benefit from an expanded developer community, since a custom GPT can be built with nothing more than natural language instructions.
This is all looking great, but not a lot has come out of OpenAI since the launch of the GPT Store, apart from rumors about GPT-5 and some hardware indulgences of Sam Altman. That’s where we are on the OpenAI side of things.
On the other hand is the competition - most notably from Google, which is 11 months late to the party, releasing the Gemini Ultra model (packaged as “Gemini Advanced” in its marketing) in Feb 2024 to compete with GPT-4, which came out in March 2023. While the product may be late to the game, it brings with it useful tricks like the ability to seamlessly integrate with Gmail, Google Calendar, YouTube, Maps, etc.
But I think the most striking thing about Gemini Advanced is that Google can push insane value to its customers at the same monthly price as ChatGPT Plus ($20) - you get all the Google One Premium benefits along with a GPT-4-class LLM that you can chat with without any limits! (GPT-4 still has a cap of 40 messages every 3 hours.) Sure, you miss out on the GPT Store and its custom GPTs, and you can only use it individually instead of password-sharing with someone else, but hey, you get all of these additional benefits:
However, the most exciting development for me has been the emergence of user-friendly on-device LLMs. Basically, these are apps you install locally on your computer to start chatting with various chatbot models. Jan is a recent example, allowing you to run various cutting-edge models locally on your Mac or Windows machine, leading to the magical feeling of using a generative AI chatbot while offline!
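If you want to go a step beyond the chat window, Jan also advertises a local, OpenAI-compatible API server you can script against. Here’s a minimal sketch of what that looks like - with the assumptions clearly stated: the API server is switched on in Jan’s settings, it listens on its default localhost port (1337 at the time of writing), and the model id below is purely illustrative, so swap in the id of a model you have actually downloaded in the app.

```python
# A minimal sketch: chatting with a model served by Jan's local API server.
# Assumptions to verify against Jan's docs: the API server is enabled,
# it listens on http://localhost:1337 (the default at the time of writing),
# and the model id matches a model you have downloaded in the app.
import requests

resp = requests.post(
    "http://localhost:1337/v1/chat/completions",
    json={
        "model": "mistral-ins-7b-q4",  # illustrative id - copy the real one from Jan
        "messages": [
            {"role": "user", "content": "Explain on-device LLMs in one sentence."}
        ],
    },
    timeout=120,
)
print(resp.json()["choices"][0]["message"]["content"])
```

The nice part of the OpenAI-style format is that the same snippet should work against other local servers (or OpenAI itself) just by swapping the base URL and model id.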
If you do want to subscribe to the latest and greatest chatbots, here are a couple of reviews to help you choose between Gemini Ultra and ChatGPT:
On the topic of LLM chatbots, I’m curious to know…
2. Create Away…
While I was gone for 6 months, Midjourney came up with v6, which boasts better understanding of natural language prompts, better positional understanding within those prompts, and features like upscale and pan. Oh, and if you’ve created more than a thousand images on the platform, you are now eligible to join their alpha web interface!
I created all the images in this month’s AI art showcase with Midjourney v6, check them out!
In addition, there are so many new tools that have come out to spice up your creativity:
1. Livesketch makes animated sketches from text prompts:
2. Lumiere from Google Research has exceptional image to video output:
Here’s more about Lumiere from one of its creators:
3. RPG-Diffusion Master is great at creating super-specific text-to-image output
4. Moondream is a small vision model which, like Jan above, runs locally on just about any device. If you are a developer, you could make a killing with a Google Lens-like service built on top of this model. It could be great for vision-impaired users too! Check it out.
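To make the “small vision model” bit concrete, here is a hedged sketch of asking Moondream a question about a photo via Hugging Face. The model id and the encode_image / answer_question helpers are taken from the project’s README at the time of writing - treat them as assumptions and double-check against the current repo before running.

```python
# A minimal sketch: visual question answering with Moondream, run locally.
# Assumptions (verify against the Moondream README): the Hugging Face model id
# below is current, and the model exposes encode_image / answer_question helpers.
from PIL import Image
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "vikhyatk/moondream2"  # check the repo for the latest id / revision
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained(model_id)

image = Image.open("receipt.jpg")                 # any local photo
encoded = model.encode_image(image)               # run the vision encoder once
answer = model.answer_question(encoded, "What is the total amount?", tokenizer)
print(answer)
```

Wrap something like this in a tiny local web UI and you are already most of the way to the Google Lens-style service mentioned above.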
Finally…
It was great to write another Future Telescope article after such a looooong time. I hope I have given you enough rabbit holes to go down, and I would LOVE to hear which topic / tool / video from this post was your favorite.
In closing, I’m quoting Andrej Karpathy, whom I featured way back in the second edition of Future Telescope (btw, we are one year old now), where he was talking about creating your own GPT. Andrej has been an AI rockstar, going from Google to Stanford to OpenAI to Tesla and back to OpenAI, and as of two days ago, out of OpenAI again to become an AI educator. Here’s what he said a few days before he moved out of OpenAI:
# on shortification of "learning"
There are a lot of videos on YouTube/TikTok etc. that give the appearance of education, but if you look closely they are really just entertainment. This is very convenient for everyone involved: the people watching enjoy thinking they are learning (but actually they are just having fun). The people creating this content also enjoy it because fun has a much larger audience, fame and revenue. But as far as learning goes, this is a trap. This content is an epsilon away from watching the Bachelorette. It's like snacking on those "Garden Veggie Straws", which feel like you're eating healthy vegetables until you look at the ingredients.
Learning is not supposed to be fun. It doesn't have to be actively not fun either, but the primary feeling should be that of effort. It should look a lot less like that "10 minute full body" workout from your local digital media creator and a lot more like a serious session at the gym. You want the mental equivalent of sweating. It's not that the quickie doesn't do anything, it's just that it is wildly suboptimal if you actually care to learn.
I find it helpful to explicitly declare your intent up front as a sharp, binary variable in your mind. If you are consuming content: are you trying to be entertained or are you trying to learn? And if you are creating content: are you trying to entertain or are you trying to teach? You'll go down a different path in each case. Attempts to seek the stuff in between actually clamp to zero.
So for those who actually want to learn. Unless you are trying to learn something narrow and specific, close those tabs with quick blog posts. Close those tabs of "Learn XYZ in 10 minutes". Consider the opportunity cost of snacking and seek the meal - the textbooks, docs, papers, manuals, longform. Allocate a 4 hour window. Don't just read, take notes, re-read, re-phrase, process, manipulate, learn.
And for those actually trying to educate, please consider writing/recording longform, designed for someone to get "sweaty", especially in today's era of quantity over quality. Give someone a real workout. This is what I aspire to in my own educational work too. My audience will decrease. The ones that remain might not even like it. But at least we'll learn something.
Going forward, I hope Future Telescope ends up being an opportunity for you to give your AI learning a real workout too. I’m also super curious to know: how did you find today’s refreshed Future Telescope? Please share your thoughts in the comments below.
Together, let us make the most out of this beautiful world!
That’s it for this month, see you in March!
It was great to read a new Future Telescope post! Regarding the AI tools Punit referenced: Jan is the most interesting to me. I am interested in checking out an offline large language model available through the Jan project. I don't have a specific LLM in mind to try because I don't know what options are available. Question: where should I look on the GitHub page for downloadable files? In my review of the page, I can't find where or how to download anything. I have already tried searching for the words "download" and "readme" on the Jan Releases page.