The A.I. Thread - Page 27 - Calgarypuck Forums - The Unofficial Calgary Flames Fan Community

I-Hate-Hulse · 12-16-2024, 09:19 AM

Is there an AI application out there where you can upload several PDFs (about 100 pages long each) on a topic, and have AI complete a 200 page PDF questionnaire based on the previously uploaded PDF files? The questionnaire to be answered is part textboxes, and part radio buttons.

I've been trying NotebookLM, which seems to ingest the incoming PDF files OK. But it doesn't seem to support the second half of this: the completion of the PDF questionnaire.

I could feed it the questions to be answered one by one into the text input box, but that's not efficient.

Firebot · 01-27-2025, 07:40 AM

Some pretty massive news in AI land that has some pretty incredible impact worthy of a bump, especially to American AI giants who have convinced investors that the only way to get to AGI was by throwing insane amounts of cash at the problem while keeping all the research private to profit from.

A small team in China called DeepSeek, on a pet project, built DeepSeek R1, an LLM thinking model that rivals o1 from OpenAI. That in itself is quite the feat to see a Chinese company suddenly at the top of the charts, but what is truly remarkable is they built the model with only 6 million dollars in training cost, AND have released the papers detailing the method AND released the model as an open source model. Everyone can now have their own open source LLM thinking model if they have the computing power for it. They are also doing it on inferior chips due to the ban imposed by the US on Nvidia providing chips to China. And anyone can also use their API at a fraction of the cost compared to the o1 and o3 models.

https://www.reddit.com/r/LocalLLaMA/...ane/?rdt=33050

This is an absolute game changer, and the market is in full panic mode.

NVDA is down 12% in one blow. NASDAQ down 3% so far already. This may well just crash the AI bubble (for the better IMO) as investors wake up to realize they were sold snake oil by salesmen who just lost their secret.

I tried it for myself very briefly, it built a working version of Tetris in one simple prompt in Python.

https://www.deepseek.com/

If you want to try it out. (Yes, it's heavily censored if you are asking it about China-sensitive political topics, but that should be a given.)

Fuzz · 01-27-2025, 08:01 AM

Here's an article with a lot more detail (and DIY instructions):

https://www.theregister.com/2025/01/...eek_r1_ai_cot/

EDIT: sorry, text got removed.

You can test it here if you don't want to use a Chinese site:
https://lmarena.ai/?arena
Click the "direct chat" tab, then select deepseek-r1 from the first dropdown box.

It did the best job yet of my test question.

Old Yeller · 01-27-2025, 09:29 AM

Nvidia down 16% now... basically every time I refresh it goes down again.

photon · 01-27-2025, 09:33 AM

I guess the question is even if true is there enough AI work out there for this to be more an adjustment for nvidia than a permanent change. Will this change things fundamentally or does this just slow nvidia's growth as everyone finds more things to do with their hardware?

I.e. buying opportunity or not.

sa226 · 01-27-2025, 10:04 AM

I sold my AVGO position this morning. It was a long term growth hold for me, and I was already up a whole bunch with it so I'm okay with that.

Generally it's a good idea to lock in profits and average out, but I'm not a trader and don't have time to watch it that closely.

There's just too much emotion out there right now. If it was an overreaction of the market I'll re-evaluate and see if I want to jump back in. If it keeps free falling, it could be a good buying opportunity. I'd just rather watch whatever this is from the sidelines.

Just realized this wasn't the stock market thread. Whatevs.

Azure · 01-27-2025, 10:25 AM

Quote:

Originally Posted by Firebot

Some pretty massive news in AI land that has some pretty incredible impact worthy of a bump, especially to American AI giants who have convinced investors that the only way to get to AGI was by throwing insane amounts of cash at the problem while keeping all the research private to profit from.

A small team in China called DeepSeek, on a pet project, built DeepSeek R1, an LLM thinking model that rivals o1 from OpenAI. That in itself is quite the feat to see a Chinese company suddenly at the top of the charts, but what is truly remarkable is they built the model with only 6 million dollars in training cost, AND have released the papers detailing the method AND released the model as an open source model. Everyone can now have their own open source LLM thinking model if they have the computing power for it. They are also doing it on inferior chips due to the ban imposed by the US on Nvidia providing chips to China. And anyone can also use their API at a fraction of the cost compared to the o1 and o3 models.

https://www.reddit.com/r/LocalLLaMA/...ane/?rdt=33050

This is an absolute game changer, and the market is in full panic mode.

NVDA is down 12% in one blow. NASDAQ down 3% so far already. This may well just crash the AI bubble (for the better IMO) as investors wake up to realize they were sold snake oil by salesmen who just lost their secret.

I tried it for myself very briefly, it built a working version of Tetris in one simple prompt in Python.

https://www.deepseek.com/

If you want to try it out. (Yes, it's heavily censored if you are asking it about China-sensitive political topics, but that should be a given.)

lol, sure.

Russic · 01-27-2025, 10:33 AM

Quote:

Originally Posted by Azure

lol, sure.

By virtue of it being open source, it's fairly straightforward to check their work (as I understand it). I don't see too many people questioning the validity of the claims.

Azure · 01-27-2025, 12:34 PM

Quote:

Originally Posted by Russic

By virtue of it being open source, it's fairly straightforward to check their work (as I understand it). I don't see too many people questioning the validity of the claims.

Quote:

DeepSeek's impact on the AI technology supply chain "all depends on the veracity of DeepSeek's claims," Truist Securities analyst William Stein said in a client note Monday.

DeepSeek alleges that it built its AI model "with inferior chips and at a fraction of the cost" of U.S. models, Stein said. "We cannot determine the veracity of DeepSeek's claims. If they're true, it magnifies the already-known risk of an AI spending slowdown."

DeepSeek claims it used 2,048 Nvidia H800 chips, a downgraded version of Nvidia's H100 chips designed to comply with U.S. export restrictions. However, some industry officials are skeptical of that claim, saying DeepSeek has more Nvidia processors and more advanced versions than it is letting on.

https://www.investors.com/news/techn...deepseek-news/

I mean the training might be true, but it seems pretty dumb to think that they are this far advanced with far lesser GPUs.

opendoor · 01-27-2025, 01:13 PM

Quote:

Originally Posted by Azure

https://www.investors.com/news/techn...deepseek-news/

I mean the training might be true, but it seems pretty dumb to think that they are this far advanced with far lesser GPUs.

I don't know, I don't find it that implausible, particularly with how a bunch of people in the US are panicking and talking about the CCP and Tiananmen Square to distract from it. If it wasn't a real threat to US dominance, the reaction would probably be different. I mean, look at innovation in EVs. China has a bunch of automakers all producing affordable EVs while the US has basically one company that can actually produce them, and then a bunch of dinosaurs who can't innovate. I don't see why this space would be any different given how bloated US tech companies are.

And even if you set aside how it was developed, it's clearly more efficient in terms of resources when in use. And anyone can run it locally. Hopefully this progress toward open source models continues.

Fuzz · 01-27-2025, 01:17 PM

This was always going to happen at some point. Current models are just brute force (more data, more compute), so it shouldn't be shocking that someone developed a more elegant method. It seems pretty dumb to think that the current path was going to continue to be the correct one.

Shazam · 01-27-2025, 01:26 PM

Quote:

Originally Posted by Fuzz

This was always going to happen at some point. Current models are just brute force (more data, more compute), so it shouldn't be shocking that someone developed a more elegant method. It seems pretty dumb to think that the current path was going to continue to be the correct one.

There is nothing novel about DS.

It is cheaper because there is no fine tuning done by humans.

https://www.seangoedecke.com/deepseek-r1

Fuzz · 01-27-2025, 01:33 PM

Quote:

Originally Posted by Shazam

There is nothing novel about DS.

It is cheaper because there is no fine tuning done by humans.

https://www.seangoedecke.com/deepseek-r1

Quote:

In short, this is a reinforcement learning approach, not a fine-tuning approach. There’s no need to generate a huge body of chain-of-thought data ahead of time, and there’s no need to run an expensive answer-checking model. Instead, the model generates its own chains-of-thought as it goes. There are other points made in the DeepSeek-R1 paper, but I think this is by far the most important.

Quote:

Addendum: this is a relatively straightforward approach that others must have thought of. Why did it happen now and not a year ago? The most compelling answer is probably this: open-source base models had to get good enough at reasoning that they could be RL-ed into becoming reasoning models. It’s plausible that a year ago that wasn’t the case. A less compelling answer: the quality of reasoning-based benchmarks is much higher now than it was. For this approach to work, you need to be able to feed the model a ton of problems that require reasoning to solve (otherwise it’ll jump straight to the solution). Maybe those problems have only recently become available.

Thanks for the link. Sounds like it is a bit novel, as they used a different feedback mechanism that doesn't appear to have caused any major issues. This will save a lot of training/money and I'd expect it to get refined further.
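
For anyone wondering what "no expensive answer-checking model" could look like in practice, here is a minimal sketch, not DeepSeek's actual code. As I understand the paper, rewards come from fixed rules (did the output follow a think/answer format, and does the final answer match a known solution), and each sample is then scored relative to the other samples for the same prompt. The tag names follow the paper's template as I recall it; the 0.5 and 1.0 weights and the function names are made up for illustration.

```python
import re
from statistics import mean, pstdev

# Assumed output template: reasoning in <think>, final answer in <answer>.
THINK_ANSWER = re.compile(
    r"<think>(.+?)</think>\s*<answer>(.+?)</answer>", re.DOTALL
)


def rule_reward(completion: str, expected: str) -> float:
    """Score one sampled completion with fixed rules, no learned checker."""
    match = THINK_ANSWER.fullmatch(completion.strip())
    if match is None:
        return 0.0  # wrong format: no reward at all
    answer = match.group(2).strip()
    format_reward = 0.5  # illustrative weight, not from the paper
    accuracy_reward = 1.0 if answer == expected else 0.0
    return format_reward + accuracy_reward


def group_advantages(rewards: list[float]) -> list[float]:
    """Normalize rewards within a group of samples for the same prompt."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    if sigma == 0:
        # Every sample scored the same, so there is nothing to prefer.
        return [0.0 for _ in rewards]
    return [(r - mu) / sigma for r in rewards]


# Three hypothetical samples for the prompt "What is 2 + 2?"
samples = [
    "<think>2 + 2 is 4</think><answer>4</answer>",
    "<think>maybe 5?</think><answer>5</answer>",
    "4",
]
rewards = [rule_reward(s, expected="4") for s in samples]
advantages = group_advantages(rewards)
```

The training step would then push the model toward samples with positive advantage and away from those with negative advantage. That part needs a real RL setup and is left out here.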

Jason14h · 01-27-2025, 01:57 PM

It’s not just this model - it’s the breakthrough that people thought was months or years away at minimum.

This new method will now be improved and iterated on, which will reduce the resources required to train and run models, and so reduce the chips people need to buy.

This stuff was already moving exponentially fast, so a breakthrough like this can cause exponential ripples.

activeStick · 01-27-2025, 02:45 PM

And it's been hit by a cyber attack - my guess is by the US given its impact on the American AI industry - within hours of its release.

https://www.kron4.com/news/national/...-cyber-attack/

Quote:

Chinese AI startup DeepSeek announced Monday that it had been hit by a large-scale cyberattack. The one-year-old startup launched its R1 AI model last week and has since drawn comparisons to OpenAI.

On Monday, DeepSeek announced that it would be temporarily limiting registrations to “ensure continued service.” Existing users, according to DeepSeek, are still able to log in as usual.

Fuzz · 01-27-2025, 02:58 PM

I wonder if it was a cyber attack, or just everyone trying to check it out.

Russic · 01-28-2025, 09:23 AM

To me what's most interesting about this is that it was ultimately USA's restriction on chips that ended up sparking the necessity of China finding a different way around the problem. Life finds a way.

The Yen Man · 01-28-2025, 10:38 AM

I totally get questioning the actual $6.5M price tag, but even if it was say 10X that price, that's still way more efficient than what current AI models were saying it would cost.

China knew exactly what they were doing putting this out as open source. They wanted to make sure their model was accessible to anyone, and screw over the big dog US tech companies as much as possible. They basically threw a grenade into the US tech stock market yesterday.

Let's see if this is a temporary blip or something permanent. I'll be the first to admit I don't know enough about AI to know how serious of a threat this is to the US tech giants. But anything that disrupts all those mega corps is ok in my books. They are getting way too big in size and influence.

Fuzz · 01-28-2025, 10:53 AM

There is also no reason the other models can't adopt this technique, and still leverage their extra power and model sizes. Will be interesting to see if that is another step improvement.

BoLevi · 01-28-2025, 11:32 AM

I think people are overstating the geopolitical aspects of this.

AI's biggest innovation will be getting commoditized. The Chinese are pretty good at that.