Wednesday, April 30, 2025, 8:20 pm
A recent study alleges that LM Arena—the team behind the popular Chatbot Arena—has manipulated its benchmark evaluations to favor select AI labs. Critics argue this approach undermines the fairness of widely recognized scoring methods, fueling demands for greater transparency and accountability in the process.
Bluesky: @techcrunch.com
Google’s AdSense advertising network started supporting ads inside users’ chats with some third-party AI chatbots earlier this year, Bloomberg reported. The company is rolling out the feature following tests with AI search startups iAsk and Liner, the report said, citing anonymous sources familiar…
You also mentioned the whole Chatbot Arena thing, which I think is interesting and points to the challenge around how you do benchmarking. How do you know what models are good for which things? One of the things we've generally tried to do over the last year is anchor more of our models in our Meta…
A new paper from AI lab Cohere, Stanford, MIT, and Ai2 accuses LM Arena, the organization behind the popular crowdsourced AI benchmark Chatbot Arena, of helping a select group of AI companies achieve better leaderboard scores at the expense of rivals. According to the authors, LM Arena allowed some…
The Chatbot Arena has become the go-to place for vibes-based evaluation of LLMs over the past two years. The project, originating at UC Berkeley, is home to a large community of model enthusiasts who submit prompts to two randomly selected anonymous models and pick their favorite response. This…
Google has agreed to pay roughly $1.375 billion to settle allegations of invasive data tracking practices in Texas. Reports detail the alleged violation of users’ geolocation and privacy rights, a hefty reminder that digital privacy demands serious oversight, even if enforcement sometimes seems as elusive as a mirage.
Facing escalating U.S. tariffs, Apple appears to be quietly shifting iPhone production to Brazil through its Foxconn partnership. Despite Apple’s denials, industry insiders suggest this strategic relocation aims to stabilize prices and secure the supply chain, proving that when it comes to international trade, geography often trumps rhetoric.
Samsung has kicked off a high-profile unveiling of its long-anticipated superslim Galaxy S25 Edge, sparking excitement among tech enthusiasts. The event showcased the phone’s innovative design and cutting-edge features, setting the stage for a fiercely competitive market. Not exactly a revolution, but enough to make followers drool.
HR tech startup Rippling has astonished investors by reaching a $16.8 billion valuation following a $450 million Series G financing round. This meteoric rise, buoyed by savvy venture capital interest (YC reportedly being one of the backers), adds another chapter to today’s ever-expanding unicorn saga.
U.S. Customs and Border Protection is set to capture the face of every driver leaving the country, using facial recognition technology that matches travelers against their passports. The agency’s ambitious plan has security officials applauding while privacy advocates raise an eyebrow at this unprecedented surveillance move.
LegoGPT AI Brings Realistic, Stable Lego Creations to Life (15 hours ago)
Court Uses AI Deepfake for Deceased Victim's Impact Statement (19 hours ago)
Google Gemini API introduces caching to slash developer costs (20 hours ago)
ChatGPT deep research tool now talks GitHub (39 hours ago)
Instacart CEO Joins OpenAI for Strategic Application Leadership (2 days ago)
OpenAI-FDA Talks Boost AI’s Role in Drug Evaluations (2 days ago)
Bill Gates accuses Elon Musk of harming vulnerable children (46 hours ago)
Netflix Unveils AI-Powered Chatbot for Smarter Content Discovery (3 days ago)
OpenAI Scrambles After ChatGPT Update Testing Fumble (4 days ago)