Published 2 days ago • Updated 1 day ago

DeepSeek’s updated R1 AI model matches Google, Anthropic in coding benchmark

Summary by South China Morning Post

The updated reasoning model, released in May, performed well against leading US models in the real-time WebDev Arena tests.

5 Articles


South China Morning Post

Center

DeepSeek’s updated R1 AI model matches Google, Anthropic in coding benchmark

The updated reasoning model, released in May, performed well against leading US models in the real-time WebDev Arena tests.

2 days ago·Hong Kong


El Cronista

DeepSeek Shakes Silicon Valley and Reveals the True Chinese AI Muscle

The founder of DeepSeek, the Chinese startup that recently upended the world of artificial intelligence, is prone to prolonged silences and faltering statements, but his new employees quickly learn not to mistake those quiet deliberations for shyness. Once Liang processes the highlights of a discussion, he fires off precise and difficult questions about model architecture, computing costs and other complexities of DeepSeek's AI systems. Employees c…

1 day ago·Argentina


Technology in Business

How DeepSeek Rewrote the Transformer

How DeepSeek Rewrote the Transformer [MLA] Technical Notes 1. Note that the DeepSeek-V2 paper claims a KV cache size reduction of 93.3%. They don’t exactly publish their methodology, but as far as I can tell it’s something like this: start with the DeepSeek-V2 hyperparameters here: https://huggingface.co/deepseek-ai/DeepSeek-V2/blob/main/configuration_deepseek.py. num_hidden_layers=30, num_attention_heads=32, …

1 day ago


Startups Magazine | Digital & Print Magazine For Tech Startups

The future of AI in business: shifting from automation to autonomy

The year kicked off with an earthquake that rattled the foundations of AI. DeepSeek’s sudden arrival demonstrated that there are multiple avenues toward superintelligence, and prompted the movers and shakers to weigh software and algorithmic improvements against banking on increases in model complexity or computational power.

1 day ago


Sify

DeepSeek’s R1 upgrade takes on GPT-4 with some “rumoured” help from Gemini’s brain!

While China’s most ambitious open-source model may have been quietly fed by one of its Western rivals, if the product is an open-source LLM better than GPT-4, does anyone really care? A couple of months ago we posted about DeepSeek training AI for pennies on the dollar and then another one about an entire [...]

2 days ago

