The A.I. Thread - Page 33 - Calgarypuck Forums - The Unofficial Calgary Flames Fan Community

kermitology · 04-04-2025, 10:33 AM

Quote:

Originally Posted by Bill Bumface

I think the senior engineer question is the important one here. AI can be an exceptional productivity booster if you have the strong software engineering principles and code review skills to prompt it/tweak into something of high quality.

I foresee companies relying on super productive senior engineers for a while, and then as they age out, there is going to be a huge hole to fill where juniors weren't given the space to do things slower themselves and develop into seniors.

My company has embraced GitHub Copilot and I've been pretty bearish on it, and for pretty much this exact reason. I talk to my team and say that these are tools that can be helpful, much like using third party libraries are helpful, but ultimately you need to understand what they're doing. AI can't understand intent, and it can't do creative intent. It's foolish to lean entirely on AI.

I'm also pretty bearish on the entire AI industry given it's god awful impact on the environment, it's accelerating costs, and the fact that we're using models to train models which seems a lot like making photo copies of photo copies. Then you've got companies like CoreWeave who are doing Lazy Susan funding with Nvidia, and OpenAI getting funding from Softbank that just seems down right fraudulent, and OpenAI is heavily dependent on CoreWeave.

I'd be wary of trying to build businesses on the back of AI.

Russic · 04-04-2025, 11:09 AM

The environmental impact is a really interesting issue. As the models have gotten more efficient, the water usage has plummeted. From an environmental perspective, eating a single burger is probably worse than an individual's ChatGPT usage over their lifetime. But of course we're going to be using it an order of magnitude more.

I do think once you align all the environmental impacts, AI ends up dwarfed by most human activities. It's a conundrum really... if AI helps advance technology to the point we can solve an issue like factory farming, solar powered robots stroll the countryside and waterways picking up our trash, and we engineer proteins that break down plastics in the soil, then suddenly the environmental impact of AI tips the other way.

The "models to train models" thing is among the most interesting things to me. I think the finding that it actually seems to work was an important turning point in the debate where a lot of people who were pessimistic began to change their minds.

missdpuck · 04-07-2025, 08:06 PM

Quote:

Originally Posted by Fuzz

I had posted about this a couple weeks ago, now there is some research into Russia poisoning AI data sources.

https://www.enterprisesecuritytech.c...tern-ai-models

Russia is winning a war we didn't even know was happening.

Interesting. I haven’t thought about Dougan in years. I remember reading his gossip forum about the local LEO departments, and according to deputies and other LEOs that I spoke to, a good deal of the information there was true. PBSO is known for being corrupt, as is BSO/Broward County Sheriff’s Department (Ft Lauderdale area). That combination of truth and lies made for a successful disinformation campaign. Some of the stuff on there was truly ridiculous. Who knew he was honing his skills for something far more damaging.

He did keep the site up for a bit after defecting to Russia. He used to use the name Badvolf.

A shame he ended up using his talents for misinformation bullcrap.

Rick Bradshaw, a Democrat , has been the Sheriff of Palm Beach county for years. Most of Dougan’s ire was directed towards him and Mike Gauger. At the time I didn’t realize what a disinformation site Dougan had going on. Makes you wonder if Trump had anything to do with his defection. I think he went to Russia just before the election in 2016.

The gossip forum was PBSOTalk.com. After he defected it was changed to PBSOTalk.ru

Fuzz · 04-12-2025, 10:35 AM

Quote:

AI coding assistants, like large language models in general, have a habit of hallucinating. They suggest code that incorporates software packages that don't exist.

As we noted in March and September last year, security and academic researchers have found that AI code assistants invent package names. In a recent study, researchers found that about 5.2 percent of package suggestions from commercial models didn't exist, compared to 21.7 percent from open source models.

Quote:

"The problem is, these code suggestions often include hallucinated package names that sound real but don’t exist. I’ve seen this firsthand. You paste it into your terminal and the install fails – or worse, it doesn’t fail, because someone has slop-squatted that exact package name."

Aboukhadijeh said these fake packages can look very convincing.

"When we investigate, we sometimes find realistic looking READMEs, fake GitHub repos, even sketchy blogs that make the package seem authentic," he said, adding that Socket's security scans will catch these packages because they analyze the way the code works.

"Even worse, when you Google one of these slop-squatted package names, you’ll often get an AI-generated summary from Google itself confidently praising the package, saying it’s useful, stable, well-maintained. But it’s just parroting the package’s own README, no skepticism, no context. To a developer in a rush, it gives a false sense of legitimacy.

"What a world we live in: AI hallucinated packages are validated and rubber-stamped by another AI that is too eager to be helpful."

https://www.theregister.com/2025/04/..._supply_chain/

AI is just way too easy to poison. You put the fake stuff out there, and it doesn't validate anything it gathers. Inexperienced coders happily assemble their product, unaware the dependency they used could be poisoning their databases, harvesting user info, installing any type of malware, and anything else you could dream up.

cral12 · 04-13-2025, 10:31 AM

Good podcast with Perplexity AI founder and Rick Rubin:

https://www.tetragrammaton.com/content/aravind-srinivas

Russic · 04-14-2025, 09:55 AM

Quote:

Originally Posted by Fuzz

https://www.theregister.com/2025/04/..._supply_chain/

AI is just way too easy to poison. You put the fake stuff out there, and it doesn't validate anything it gathers. Inexperienced coders happily assemble their product, unaware the dependency they used could be poisoning their databases, harvesting user info, installing any type of malware, and anything else you could dream up.

It'll be very interesting to see if agents are at all helpful in situations like this. If you have a dedicated ai focused solely on something like security, could that improve things, or would it be easy to trick?

Conceivably I could see a system where you go to make a change and your eager-to-help ai is interrupted by your security ai that won't let them move ahead with their helpful-but-insanely-foolish edit.

Fuzz · 04-14-2025, 10:20 AM

Ya, that's a possibility, but given how we've seen other infiltration techniques evolve over the years, I don't have a lot of confidence security AI bots couldn't also be tricked.

That's also one of those things that sounds easy enough, but when you think about it becomes very complicated. Beginning with secure principles like "the only secure system is one not on the internet" means your AI security bot won't let you make anything useful. So you loosen that requirement, and then, well, what modules are approved to be used, and known to be secure and safe? None, because they are only as secure as we think them to be, until someone finds a hole. So now you need to go with a mushy definition, and I think as you work through coding projects the security features would just have to be turned down until you get a functioning product.

I don't think this is a necesarily "never" idea but I'm not sure how reasonable and effective it would be in our current world of IT, which any honest security expert will tell you is a minefield of disasters waiting to explode.

psyang · 04-14-2025, 10:54 AM

Quote:

Originally Posted by Fuzz

https://www.theregister.com/2025/04/..._supply_chain/

AI is just way too easy to poison. You put the fake stuff out there, and it doesn't validate anything it gathers. Inexperienced coders happily assemble their product, unaware the dependency they used could be poisoning their databases, harvesting user info, installing any type of malware, and anything else you could dream up.

Ran into this last week - but it wasn't poisoning of the dataset.

Was using a powershell api to powerbi, and wanted to see if there was a method to remove a power bi dataset.

The api is of the form:
<verb>-PowerBi<object>

So, like remove-PowerBiReport, or get-PowerBiDataset

I wanted to see if there was a remove-PowerBiDataset method. Doing a google search, the google api reported the following:

Unfortunately, there actually is no remove-PowerBIDataset method in the powershell api. AI inferred it should exist, what it should look like, and what parameters should be used! This wasn't because someone poisoned the dataset, but because the method naming scheme makes it easy for AI to figure out what a likely (non-existent) method should be called.

Firebot · 04-14-2025, 11:01 AM

Quote:

Originally Posted by Russic

It'll be very interesting to see if agents are at all helpful in situations like this. If you have a dedicated ai focused solely on something like security, could that improve things, or would it be easy to trick?

Conceivably I could see a system where you go to make a change and your eager-to-help ai is interrupted by your security ai that won't let them move ahead with their helpful-but-insanely-foolish edit.

MCP servers can theoretically do this (the next evolution of agents). MCP (model context protocol) which was created by Anthropic has taken the AI world by storm in recent months, even Open AI is supporting it as a go forward.

GitHub has it's own MCP server and with the implementation of MCP for LLM to communicate directly with applications in a code that it understands, expect security companies to develop / sell services to actively scan for bad or vulnerable code via an MCP server. Of course...MCP has its own inherent risk...

https://github.com/modelcontextprotocol

Some are already coming out, likely Socket will provide this soon

https://invariantlabs.ai/blog/introducing-mcp-scan

MCP is the new gold rush of sorts. I have my own setup with multiple MCP servers built via Claude Desktop that I now use for coding which has direct access to some of my directories as well as web search. I built through it a full AI video generation pipeline that creates a 30 second video with voiceover from a generated script via APIs using each provider's API documentation (ex: Gemini Pro 2.5, Hailuaoi, ElevenLabs), joins all the segments and music together to make a fully cohesive video and post it to Youtube for example for a truly faceless automated channel.

Kling 2.0 is coming out tomorrow. I was actually recruited to make a number of sponsored videos for it before the reveal with early access to it, but could not finalize price negotiation in time for this specific campaign. Ugh, well next time. Things are going extremely well and progressing far faster than I could have dreamed.

Fuzz · 04-14-2025, 11:24 AM

Quote:

Originally Posted by psyang

Ran into this last week - but it wasn't poisoning of the dataset.

Was using a powershell api to powerbi, and wanted to see if there was a method to remove a power bi dataset.

The api is of the form:
<verb>-PowerBi<object>

So, like remove-PowerBiReport, or get-PowerBiDataset

I wanted to see if there was a remove-PowerBiDataset method. Doing a google search, the google api reported the following:

Unfortunately, there actually is no remove-PowerBIDataset method in the powershell api. AI inferred it should exist, what it should look like, and what parameters should be used! This wasn't because someone poisoned the dataset, but because the method naming scheme makes it easy for AI to figure out what a likely (non-existent) method should be called.

Ya, I've come across stuff like that. It's also really good at taking stuff from other products and applying it to the product you asked about when it can't find information specifically for your model/version. Like if you have a Fitbit Sense it'll give you instructions that work for the Fitbit Charge, but are useless on the Sense. This is super common.

psyang · 04-14-2025, 11:41 AM

Quote:

Originally Posted by Fuzz

Ya, I've come across stuff like that. It's also really good at taking stuff from other products and applying it to the product you asked about when it can't find information specifically for your model/version. Like if you have a Fitbit Sense it'll give you instructions that work for the Fitbit Charge, but are useless on the Sense. This is super common.

I remember in the early days of AI, reading this article about how a user was able to create a virtual file system, and even get at websites in the alt-internet.

The line between "inferring new information" to further knowledge and just "making stuff up" is critical, but so far also too ambiguous to properly define.

kermitology · 04-14-2025, 12:06 PM

I continually come back to the fundamental issue of trust. I cannot in good faith trust what AI comes up with, and reliance on it without having a solid understanding already is dangerous.

LLMs are syntax generators, they have no concept of truth.

Fuzz · 04-19-2025, 09:26 AM

Quote:

According to OpenAI’s internal tests, o3 and o4-mini, which are so-called reasoning models, hallucinate more often than the company’s previous reasoning models — o1, o1-mini, and o3-mini — as well as OpenAI’s traditional, “non-reasoning” models, such as GPT-4o.

Perhaps more concerning, the ChatGPT maker doesn’t really know why it’s happening.
In its technical report for o3 and o4-mini, OpenAI writes that “more research is needed” to understand why hallucinations are getting worse as it scales up reasoning models. O3 and o4-mini perform better in some areas, including tasks related to coding and math. But because they “make more claims overall,” they’re often led to make “more accurate claims as well as more inaccurate/hallucinated claims,” per the report.

OpenAI found that o3 hallucinated in response to 33% of questions on PersonQA, the company’s in-house benchmark for measuring the accuracy of a model’s knowledge about people. That’s roughly double the hallucination rate of OpenAI’s previous reasoning models, o1 and o3-mini, which scored 16% and 14.8%, respectively. O4-mini did even worse on PersonQA — hallucinating 48% of the time.

https://techcrunch.com/2025/04/18/op...lucinate-more/

It's making #### up almost half the time. Not great.

OldDutch · 04-19-2025, 10:14 AM

Quote:

Originally Posted by kermitology

My company has embraced GitHub Copilot and I've been pretty bearish on it, and for pretty much this exact reason. I talk to my team and say that these are tools that can be helpful, much like using third party libraries are helpful, but ultimately you need to understand what they're doing. AI can't understand intent, and it can't do creative intent. It's foolish to lean entirely on AI.

I'm also pretty bearish on the entire AI industry given it's god awful impact on the environment, it's accelerating costs, and the fact that we're using models to train models which seems a lot like making photo copies of photo copies. Then you've got companies like CoreWeave who are doing Lazy Susan funding with Nvidia, and OpenAI getting funding from Softbank that just seems down right fraudulent, and OpenAI is heavily dependent on CoreWeave.

I'd be wary of trying to build businesses on the back of AI.

History repeats itself. I was paid well to quit college and program Javascript and HTML during the dot com boom. This feels oddly similar with “prompt engineers”. Both very young me and now, we had minimal training and experience but the money was flowing hard. Everyone needed a website.

Then Jan 2000 hit, and it deflated. Not in days, but months and years.

I think this will be the same. AI like Web isn’t going away. The hype and shine will, and it will then morph into integrating with our lives. It will get much better, with less hype, but as always people will lose their shirts. Except for the new Mark Cuban, can’t wait to see who that is. Or maybe not.

missdpuck · 04-19-2025, 10:33 AM

Quote:

Originally Posted by kermitology

I continually come back to the fundamental issue of trust. I cannot in good faith trust what AI comes up with, and reliance on it without having a solid understanding already is dangerous.

LLMs are syntax generators, they have no concept of truth.

Yes. I asked ChatGPT 2 questions the other day; both answers were incorrect.

Wormius · 04-19-2025, 10:47 AM

I wish it would at least respond with, “I don’t know how to figure this out” rather than wasting my time having to read the convoluted BS it comes up with when answering a question.

Russic · 04-19-2025, 11:54 AM

I have a pet theory that hallucinations can be mitigated by agents in a sort of "peer-review" session. To test this I had ChatGPT give me a breakdown of the Flames 2004 cup run, then I handed that off to a separate LLM and told them to sweep for inaccuracies.

It looks like the original output got the following wrong:

- It said Kipper set a record for GAA (it was a "modern day" record)
- It said we won 3 in OT vs Detroit (we won 2)

The second LLM did actually catch both errors and corrected them... so in a weird way this did sorta work. Of course it couldn't be less scientific and I can't be 100% certain it didn't miss things. However, it did patch 2 gaping holes.

Fuzz · 04-19-2025, 12:01 PM

I suspect some of it is due to one of the fundamental parts of training, the reward mechanism. The models are trying to get things right, and if they can make data up that makes them "correct" they then think that is success. They are more eager to provide an answer than not, because that is their job.

If that is part of the issue, they would have to find away to lower the drive to give any answer, and raise the penalty for giving incorrect ones, such that the model is more likely to say it doesn't know. It's the balance between thoughtfully assembling low availability of information and intuiting what the answer might be vs saying "I don't have enough information to give you a reasonable answer to that."

Firebot · 04-23-2025, 01:41 PM

Interesting article from Anthropic on malicious use and how prevalent it is becoming, and to what ends.

https://www.anthropic.com/news/detec...ude-march-2025

Shazam · 04-23-2025, 02:03 PM

Quote:

Originally Posted by TorqueDog

Claude got me started on a Python app I wanted to create, but it was ChatGPT (o3-mini-high) that actually got the damn thing to run properly and generated the desired output.

I wanted a program that would run automatically at 9 AM ET, gather all the post-market activity from the previous day and pre-market activity from the current day, and then plot the price action, high-low, levels of resistance and support, using the 1D, 4H, and 30M, for ES and NQ futures, plus the Magnificent 7. Put them each into their own chart covering just the extended market for that time period.

Claude got it started but kept making the date range too long (plus had some syntax bugs). After a few "fix your code, here's the error" prompts, Claude's version couldn't actually produce anything and would error out when accessing yfinance. So I took the initial "working but flawed" Claude version and put it into ChatGPT and had it slowly correct each bug at a time, plus create a verbose logging mode for better troubleshooting of code issues.

It now works perfectly. I was pretty blown away.

But see... that's not very useful. Oh sure, useful to you. But it's just another contact manager. It has zero value. Like, could you sell it? Like, is someone going to pay $200/mth to create essentially one-off apps?