Cursor’s Third Era: Cloud Agents

All speakers are announced for AIE EU, and the schedule is coming soon. Join us there or in Miami with the renowned organizers of React Miami! The Singapore CFP is also open!

We’ve called this out a few times over in AINews, but the overwhelming consensus in the Valley is that “the IDE is Dead”. In November it was just a gut feeling, but now we actually have data: even at the canonical “VSCode Fork” company, people are officially using agents more than tab autocomplete (the first wave of AI coding):

Cursor launched cloud agents a few months ago, and this specific launch is around Computer Use, which has come a long way since we first talked with Anthropic about it in 2024, and which Jonas productized as Autotab:

The new Claude 3.5 Sonnet, Computer Use, and Building SOTA Agents — with Erik Schluntz, Anthropic


We have announced our first speaker, friend of the show Dylan Patel, and topic slates for Latent Space LIVE! at NeurIPS. Sign up for IRL/Livestream and to debate!

We also take the opportunity to do a live demo and talk about slash commands and subagents, and the future of continual learning and personalized coding models, something that Sam previously worked on at New Computer. (The fact that both of these folks are top-tier CEOs of their own startups who have now joined the insane talent density gathering at Cursor should also not be overlooked.)

Full Episode on YouTube!

Please like and subscribe!

Timestamps

00:00 Agentic Code Experiments
00:53 Why Cloud Agents Matter
02:08 Testing First Pillar
03:36 Video Reviews Second Pillar
04:29 Remote Control Third Pillar
06:17 Meta Demos and Bug Repro
13:36 Slash Commands and MCPs
18:19 From Tab to Team Workflow
31:41 Minimal Web UI Philosophy
32:40 Why No File Editor
34:38 Full Stack Cursor Debate
36:34 Model Choice and Auto Routing
38:34 Parallel Agents and Best Of N
41:41 Subagents and Context Management
44:48 Grind Mode and Throughput Future
01:00:24 Cloud Agent Onboarding and Memory


Transcript

EP 77 - CURSOR - Audio version

[00:00:00]

Agentic Code Experiments

Samantha: This is another experiment that we ran last year and decided not to ship at that time, but may come back to: an LLM judge, but one that was also agentic and could write code. So it wasn’t just picking, but also taking the learnings from the two or N models that it was looking at and writing a new diff.

And what we found was that there were strengths to using models from different model providers as the base level of this process. Basically you could get an almost synergistic output that was better than having a very unified bottom model tier.

Jonas: We think that over the coming months, the big unlock is not going to be one person with a model getting more done, like the water flowing faster. We’ll be making the pipe much wider and so parallelizing more. Whether that’s swarms of agents or parallel agents, both of those are things that contribute to getting much more done in the same amount of time.

Why Cloud Agents Matter

swyx: This week, one of the biggest launches that Cursor’s ever done is cloud agents. I think you had [00:01:00] cloud agents before, but this was like, you give Cursor a computer, right? Yeah. So it’s just basically they bought Autotab and then they repackaged it. Is that what’s going on, or...

Jonas: That’s a big part of it.

Yeah. Cloud agents already ran in their own computers, but they were sort of sight-reading code. And those computers were typically blank VMs that were not set up with the DevEx for whatever repo the agent’s working on. One of the things that we talk about is: if you put yourself in the model’s shoes and you were seeing tokens stream by, and all you could do was sight-read code and spit out tokens and hope that you had done the right thing,

swyx: no chance.

Jonas: I’d be so bad.

Like, you obviously need to run the code. And so that, I think, is also probably not that contrarian of a take, but no one has done that yet. And so giving the model the tools to onboard itself and then use full computer use end-to-end, pixels in, coordinates out, and have the cloud computer with different apps in it, is the big unlock that we’ve seen internally in terms of usage of this going from “oh, we use it for little copy changes” [00:02:00] to “no, we’re really driving new features with this new type of agentic workflow.” Alright, let’s see it. Cool.

Live Demo Tour

Jonas: So this is what it looks like on cursor.com/agents. This is one I kicked off a while ago. On the left-hand side is the chat. Very classic agentic thing. The big new thing here is that the agent will test its changes.

So you can see here it worked for half an hour. That is because it not only took time to write the tokens of code, it also took time to test them end to end. So it started dev servers and iterated when needed. And so that’s one part of it: the model works for longer and doesn’t come back with an “I tried some things” PR, but an “I tested it” PR that’s ready for your review.

One of the other intuition pumps we use there is: if a human gave you a PR and asked you to review it and they hadn’t tested it, you’d also be annoyed, because you’d be like, only ask me for a review once it’s actually ready. So that’s what we’ve done with

Testing Defaults and Controls

swyx: Simple question I wanted to get out up front.

Some PRs are way smaller, [00:03:00] like just a copy change. Does it always do the video, or is it sometimes?

Jonas: Sometimes.

swyx: Okay. So what’s the judgment?

Jonas: The model does it. So we do some default prompting on what types of changes to test. There’s a slash command that people can use called /no-test, where if you do that, the model will not test,

swyx: but the default is test.

Jonas: The default is to be calibrated. So we tell it: don’t test very simple copy changes, but test more complex things. And then users can also write their agents.md and specify, like, if you’re editing this subpart of my monorepo, never test it, ’cause that won’t work or whatever.
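The calibration Jonas describes (an explicit opt-out, per-repo rules, and a default that skips trivial copy changes) could be sketched roughly as below. This is only an illustration: in the product the decision is made by the model via prompting, and the `ChangeRequest` type, the `is_simple_copy_change` flag, the glob-style rules, and the exact `/no-test` spelling are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from fnmatch import fnmatch


@dataclass
class ChangeRequest:
    """Hypothetical description of one agent task (not a Cursor type)."""
    prompt: str                     # user prompt, may contain slash commands
    changed_paths: list[str]        # files the agent edited
    is_simple_copy_change: bool     # assumed output of some upstream judgment
    # Glob patterns a user might express in agents.md as "never test here".
    never_test_globs: list[str] = field(default_factory=list)


def should_test(req: ChangeRequest) -> bool:
    """Return True when the change should be tested end to end."""
    # 1. An explicit opt-out in the prompt always wins.
    if "/no-test" in req.prompt.split():
        return False
    # 2. Repo rules: skip when every changed file falls under a never-test glob.
    if req.changed_paths and all(
        any(fnmatch(path, glob) for glob in req.never_test_globs)
        for path in req.changed_paths
    ):
        return False
    # 3. Calibrated default: skip trivial copy changes, test everything else.
    return not req.is_simple_copy_change
```

The ordering mirrors the conversation: the user’s explicit instruction comes first, repository rules second, and the default judgment last.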

Videos and Remote Control

Jonas: So pillar one is the model actually testing. Pillar two is the model coming back with a video of what it did.

We have found that in this new world where agents can write much more code end-to-end, reviewing the code is one of these new bottlenecks that crop up. And so reviewing a video is not a substitute for reviewing code, but it is an entry point that is much, much easier to start with than glancing at [00:04:00] some giant diff.

And so typically you kick one off, it’s done, you come back, and the first thing that you would do is watch this video. So this is a video of it. In this case I wanted a tooltip over this button. And so it went and showed me what that looks like in this video. I think here it actually used a gallery.

So sometimes it will build Storybook-type galleries where you can see that component in action. And so that’s pillar two: these demo videos of what it built. And then pillar number three is that I have full remote control access to this VM. So I can go in here. I can hover things, I can type, I have full control.

And same thing for the terminal. I have full access. And so that is also really useful, because sometimes the video is all you need to see. And oftentimes, by the way, the video’s not perfect. The video will show you: is this worth merging immediately, or, oftentimes, is this worth iterating with to get it to that final stage where I am ready to merge it in?

So I can go through some other examples where the first video [00:05:00] wasn’t perfect, but it gave me confidence that we were on the right track, and two or three follow-ups later, it was good to go. And then I also have full access here, where some things you just wanna play around with. You wanna get a feel for what this is, and there’s no substitute for a live preview.

And the VNC kind of VM remote access gives you that.

swyx: Amazing. Sorry, what is VNC?

Jonas: Just the remote desktop. Remote desktop. Yeah.

swyx: Sam, any other details that you always wanna call out?

Samantha: Yeah, for me the videos have been super helpful, especially in cases where a common problem for me with agents and cloud agents beforehand was almost like underspecification in my requests. Our plan mode, and going back and forth to get a detailed implementation spec, is a way to reduce the risk of underspecification. But then, similar to how human communication breaks down over time, I feel like you have this risk where it’s, okay, when I go to the trouble of pulling down and running this branch locally, I’m gonna see that, like, I said this should be a toggle and you have a checkbox, and why didn’t you get that detail?

And having the video up front just [00:06:00] makes that alignment very clear, like you’re talking about a shared artifact with the agent, which has been just super helpful for me.

Jonas: I can quickly run through some other examples.

Meta Agents and More Demos

Jonas: So this is a very front-end-heavy one. So one question I was

swyx: gonna say, is this only for front

Jonas: end?

Exactly. One question you might have is: is this only for front end? So this is another example, where the thing I wanted it to implement was a better error message for saving secrets. So the cloud agents support adding secrets; that’s part of what they need to access certain systems. Part of onboarding is giving that access.

This is cloud agents working on

swyx: cloud agents. Yes.

Jonas: So this is a fun thing:

Samantha: it can get super meta. It

Jonas: can get super meta. It can start its own cloud agents, it can talk to its own cloud agents. Sometimes it’s hard to wrap your mind around that. We have disabled cloud agents starting more cloud agents, so we currently disallow that.

Someday you might. Someday we might. So this actually was mostly a backend change in terms of the error handling here, where if the [00:07:00] secret is far too large, it would... oh, this is actually really cool. Wow. That’s the DevTools. So if the secret is far too large, we

don’t allow secrets above a certain size. We have a size limit on them. And the error message there was really bad. It was just some generic “failed to save” message. So I was like, hey, we want a proper error message. So the first cool thing it did here, with zero prompting on how to test this: instead of typing out a character 5,000 times to hit the limit, it opens DevTools, writes JS to paste 5,000 characters of the letter A into the input, closes DevTools, hits save, and gets the new error message.
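A rough sketch of the kind of backend change described here, together with the agent’s trick of generating the oversized input in code rather than typing it, might look like this. The limit value, the `save_secret` function, and the error class are all hypothetical; the episode gives neither the real limit nor the real code.

```python
MAX_SECRET_CHARS = 4096  # hypothetical limit; the real value is not stated


class SecretTooLargeError(ValueError):
    """Specific error raised instead of a generic 'failed to save'."""


def save_secret(store: dict[str, str], name: str, value: str) -> None:
    """Save a secret, rejecting oversized values with a descriptive message."""
    if len(value) > MAX_SECRET_CHARS:
        raise SecretTooLargeError(
            f"Secret '{name}' is {len(value)} characters; "
            f"the limit is {MAX_SECRET_CHARS}."
        )
    store[name] = value


# Same idea as the agent's DevTools snippet: build the oversized value
# programmatically instead of typing one character 5,000 times.
oversized = "A" * 5000
```

Calling `save_secret({}, "API_KEY", oversized)` raises `SecretTooLargeError` with a message that names both the actual size and the limit, which is the sort of end-to-end behavior the agent verified in its video.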

So it looks like the video actually cut off, but here you can see the screenshot of the error message. So that is a frontend-plus-backend, end-to-end feature to get that.

swyx: Yeah.

Jonas: And

swyx: And you just need a full VM, a full computer, to run everything.

Okay. Yeah.

Jonas: Yeah. So we’ve had versions of this. This is one of the Autotab lessons, where we started that in 2022. [00:08:00] No, in 2023. And at the time it was like browser use, DOM, all these different things. And I think we ended up very AGI-pilled, in the sense that: just give the model pixels, give it a box. A brain in a box is what you want, and you want to remove limitations around context and capabilities such that the bottleneck is the intelligence.

And given how smart models are today, that’s a very far-out bottleneck. And so giving it its full VM and having it be onboarded with DevEx set up like a human would has just been, for us internally, a really big step change in capability.

swyx: Yeah, I would say, let’s call it a year ago, the models weren’t even good enough to do any of this stuff.

So

Samantha: even six months ago. Yeah.

swyx: So yeah, what people have told me is that round about Sonnet 4.5 is when this started being good enough to just automate fully by pixel.

Jonas: Yeah, I think it’s always a question of when is good enough. I think we found in particular with Opus 4.5, 4.6, and Codex 5.3 that those were additional step [00:09:00] changes in the autonomy-grade capabilities of the model to just

go off and figure out the details and come back when it’s done.

swyx: I wanna appreciate a couple of details. One, TanStack Router. I see it. Yeah. I’m a big fan. Do you know how TanStack got its name?

Jonas: No.

swyx: This is just random lore: somebody named Tanner. And then the other thing, if you switch back to the video.

Jonas: Yeah.

swyx: I wanna shout out this thing. Probably Sam did it. I don’t know.

Jonas: The chapters.

swyx: What is this called? Yeah, this is called Chapters. It’s like a Vimeo thing, I don’t know. But the design details are so nice. And obviously a company called Cursor has to have a beautiful cursor

Samantha: and it is

swyx: the cursor.

Samantha: Cursor.

swyx: You see it branded? It’s the cursor. Cursor, yeah. Okay, cool. And then I complained to Evan. I was like, okay, but you guys branded everything but the wallpaper. And he was like, no, that’s a Cursor wallpaper. I was like, what?

Samantha: Yeah. Rio picked the wallpaper, I think. Yeah. The video,

that’s probably Alexi and, yeah, a few others on the team with the chapters on the video. Matthew Frederico. There’s been a lot of teamwork on this. It’s a huge effort.

swyx: I just like design details.

Samantha: Yeah.

swyx: And then when you download it, it adds like a little Cursor kind of TikTok clip. [00:10:00] Yes. Yes.

So it’s to make it really obvious it’s from Cursor,

Jonas: we did the TikTok branding at the end. This was actually in our launch video. Alexi demoed the cloud agent that built that feature. Which was funny, because one of the consequences of having these videos is that we use best-of-N, where you run different models head to head on the same prompt.

We use that a lot more, because one of the complications with doing that before was you’d run four models and they would come back with some giant diff, like 700 lines of code times four. What are you gonna do? You’re gonna review all that? That’s horrible. But if you come back with four 20-second videos, yeah, I’ll watch four 20-second videos.

And then even if none of them is perfect, you can figure out which one of those you want to iterate with to get it over the line. Yeah. And so that’s been really fun.
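As a loose illustration of the best-of-N pattern described above, the sketch below runs the same prompt through several models in parallel and collects one short video artifact per candidate. The `Candidate` type, the stand-in runners, and the model names are all made up for the example; real runs would call actual agents rather than return canned values.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable


@dataclass
class Candidate:
    """Hypothetical result of one agent run."""
    model: str
    diff: str
    video_seconds: int


def best_of_n(
    prompt: str, runners: dict[str, Callable[[str], Candidate]]
) -> list[Candidate]:
    """Run every model on the same prompt in parallel; return all candidates."""
    with ThreadPoolExecutor(max_workers=len(runners)) as pool:
        futures = {name: pool.submit(run, prompt) for name, run in runners.items()}
        # Preserve the order in which the runners were given.
        return [futures[name].result() for name in runners]


def fake_runner(model: str) -> Callable[[str], Candidate]:
    """Stand-in for a real agent run: returns a canned diff and a 20 s video."""
    def run(prompt: str) -> Candidate:
        return Candidate(model=model, diff=f"# {model} diff for: {prompt}", video_seconds=20)
    return run


runners = {name: fake_runner(name) for name in ("model-a", "model-b", "model-c", "model-d")}
candidates = best_of_n("add a tooltip over the button", runners)
# Reviewing four short videos costs seconds, not four 700-line diffs.
review_seconds = sum(c.video_seconds for c in candidates)
```

The function deliberately returns every candidate rather than picking a winner, since in the workflow described a human watches the videos and chooses which one to iterate on.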

Bug Repro Workflow

Jonas: Here’s another example that we found really cool, which we’ve actually since turned into a slash command as well, slash [00:11:00] repro. For bugs in particular, with the model having full access to its own VM, it can first reproduce the bug, make a video of the bug reproducing, fix the bug, and make a video of the bug being fixed, doing the same workflow with, obviously, the bug not reproducing.

And that has been the single category that has gone from “these types of bugs are really hard to reproduce and take tons of time locally, and even if you try a cloud agent on it, are you confident it actually fixed it?” to “when this happens, you’ll merge it in 90 seconds” or something like that.
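The reproduce, fix, re-check loop can be sketched as a small harness. Everything here is illustrative: `repro_then_fix`, `ReproReport`, and the toy `submit` function (a simplified stand-in for an attachment bug in which removed images are still sent) are assumptions for the example, not Cursor’s implementation, and the real workflow records videos rather than booleans.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ReproReport:
    reproduced_before: bool  # did the bug show up before the fix?
    reproduced_after: bool   # does it still show up after the fix?

    @property
    def fix_verified(self) -> bool:
        # Only trust a fix if the bug was seen first and is gone afterwards.
        return self.reproduced_before and not self.reproduced_after


def repro_then_fix(
    bug_reproduces: Callable[[], bool], apply_fix: Callable[[], None]
) -> ReproReport:
    """Reproduce first, then fix, then run the same check again."""
    before = bug_reproduces()
    if not before:
        # Without a reproduction there is nothing to verify a fix against.
        return ReproReport(False, False)
    apply_fix()
    return ReproReport(before, bug_reproduces())


# Toy stand-in for an attachment bug.
state = {"fixed": False}


def submit(attached: list[str], removed: list[str]) -> list[str]:
    if state["fixed"]:
        return [a for a in attached if a not in removed]
    return list(attached)  # bug: removed images are still submitted


def bug_reproduces() -> bool:
    return "2.png" in submit(["1.png", "2.png"], removed=["2.png"])


report = repro_then_fix(bug_reproduces, lambda: state.update(fixed=True))
```

Running the identical check before and after is what makes the result quick to trust: a fix only counts as verified when the failure was actually observed first.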

So this is an example where, let me see if this is the broken one or the, okay, this is the fixed one. Okay. So we had a bug on cursor.com/agents where if you would attach images where remove them. Then still submit your prompt. They would actually still get attached to the prompt. Okay. And so here you can see Cursor is using, its full desktop by the way.

Вот пример — давай посмотрим, это сломанный или исправленный, окей, это исправленный. Окей. У нас был баг на cursor.com/agents: если ты прикрепляешь изображения, потом удаляешь их и всё равно отправляешь промт — они всё равно прикреплялись к промту. Окей. Здесь видно — Cursor использует свой полный десктоп.

This is one of the cases where if you just do, browse [00:12:00] use type stuff, you’ll have a bad time. ‘cause now it needs to upload files. Like it just uses its native file viewer to do that. And so you can see here it’s uploading files. It’s going to submit a prompt and then it will go and open up. So this is the meta, this is cursor agent, prompting cursor agent inside its own environment.

Это один из случаев, когда если делать просто [00:12:00] browser-use штуки — тебе придётся плохо. Потому что теперь ему нужно загружать файлы. Он просто использует нативный файловый viewer, чтобы это сделать. И видно, что он загружает файлы. Он собирается отправить промт, и потом откроет — это мета, это cursor agent промтит cursor agent внутри собственного окружения.

And so you can see here bug, there’s five images attached, whereas when it’s submitted, it only had one image.

И видно — баг: пять изображений прикреплено, тогда как при отправке было только одно изображение.

swyx: I see. Yeah. But you gotta enable that if you’re gonna use cur agent inside cur.

swyx: Понятно. Да. Но это нужно включить, если хочешь использовать Cursor-агента внутри Cursor.

Jonas: Exactly. And so here, this is then the after video where it went, it does the same thing. It attaches images, removes, some of them hit send.

Jonas: Именно. И вот здесь это видео «после» — он делает то же самое. Прикрепляет изображения, удаляет часть, жмёт Send.

And you can see here, once this agent is up, only one of the images is left in the attachments. Yeah.

И видно — когда этот агент поднят, в attachments осталось только одно изображение. Да.

swyx: Beautiful.

swyx: Красота.

Jonas: Okay. So easy merge.

Jonas: Окей. Лёгкий мердж.

swyx: So yeah. When does it choose to do this? Because this is an extra step.

swyx: Да. Когда он решает так сделать? Потому что это лишний шаг.

Jonas: Yes. I think I’ve not done a great job yet of calibrating the model on when to reproduce these things.

Jonas: Да. Я думаю, что пока не отлично откалибровал модель на то, когда воспроизводить.

Yeah. Sometimes it will do it of its own accord. Yeah. We’ve been conservative, where we try to have it only do it when it’s [00:13:00] quite sure, because it does add some amount of time to how long it takes it to work on it. But we also have added things like the /repro command, where you can just do “fix this bug /repro” and then it will know that it should first make you a video of it actually finding and making sure it can reproduce the bug.

Да. Иногда она делает это по собственной инициативе. Да. Мы были консервативны — стараемся, чтобы она делала это только когда [00:13:00] довольно уверена, потому что это добавляет время к работе. Но мы также добавили такие штуки, как /repro, где ты можешь просто написать «fix this bug /repro» — и она поймёт, что должна сначала сделать тебе видео того, как она реально находит и убеждается, что может воспроизвести баг.

swyx: Yeah. Yeah. One sort of ML topic this ties into is reward hacking, where you write a test that you then update only so it passes. So: first write the test, show me it fails, then make the test pass, which is the classic, like, red-green.

swyx: Да. Один ML-сюжет, к которому это привязано — reward hacking. Ты пишешь тест, который должен пройти. Сначала пишешь тест, он показывает мне, что тест падает, потом доводишь тест до прохождения — это классический red-green.

Jonas: Yep.

Jonas: Угу.

swyx: Like

swyx: Типа

Jonas: TDD.

Jonas: TDD.

swyx: thing.

swyx: штука.
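As a rough illustration of the red-green loop discussed here, the attachment bug from the demo can be sketched in Python. Everything in this block is hypothetical: `BuggyDraft`, `FixedDraft`, and `reproduces_bug` are made-up names, and the actual cursor.com/agents code is not shown in the conversation. The point is only the order of operations: first demonstrate that the reproduction fails on the broken code, then show that the same check passes after the fix.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class BuggyDraft:
    """Hypothetical prompt draft: removed images are still submitted."""
    visible: List[str] = field(default_factory=list)
    uploaded: List[str] = field(default_factory=list)

    def attach(self, name: str) -> None:
        self.visible.append(name)
        self.uploaded.append(name)

    def remove(self, name: str) -> None:
        # Bug: only the visible list is updated, the upload list is not.
        self.visible.remove(name)

    def submit(self) -> List[str]:
        return list(self.uploaded)


@dataclass
class FixedDraft(BuggyDraft):
    """Same draft with the fix: removal also drops the upload."""

    def remove(self, name: str) -> None:
        self.visible.remove(name)
        self.uploaded.remove(name)


def reproduces_bug(draft_cls) -> bool:
    """Attach five images, remove four, and report whether extras are sent."""
    draft = draft_cls()
    for i in range(5):
        draft.attach(f"image-{i}.png")
    for i in range(1, 5):
        draft.remove(f"image-{i}.png")
    return draft.submit() != ["image-0.png"]


if __name__ == "__main__":
    print("red, bug reproduced:", reproduces_bug(BuggyDraft))
    print("green, fix verified:", not reproduces_bug(FixedDraft))
```

Running the reproduction against the broken version first is what guards against the reward-hacking failure mode: a check that never failed proves nothing about the fix.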

No, very cool. Was that the last demo? Is there

Очень круто. Это была последняя демка? Есть

Jonas: Yeah.

Jonas: Да.

Anything I missed on the demos or points that you think? I think that

Что-нибудь, что я упустил по демкам или моментам, которые, как ты думаешь?

Samantha: covers it well. Yeah.

Samantha: Думаю, это покрывает всё. Да.

swyx: Cool. Before we stop the screen share, can you gimme like a, just a tour of the slash commands ‘cause I so God ready. Huh, what? What are the good ones?

swyx: Круто. Прежде чем мы прекратим расшаривать экран — можешь дать просто тур по slash-командам? Какие хорошие?

Samantha: Yeah, we wanna increase discoverability around this too.

Samantha: Да, мы хотим повысить discoverability здесь.

I think that’ll be like a future thing we work on. Yeah. But there’s definitely a lot of good stuff now

Это, наверное, будет одна из будущих вещей, над которой будем работать. Да. Но точно есть много хорошего сейчас.

Jonas: we have a lot of internal ones that I think will not be that interesting. Here’s an internal one that I’ve made. I don’t know if anyone else at Cursor uses this one. Fix bb.

Jonas: У нас много внутренних, которые, кажется, не особо интересные. Вот внутренняя, которую сделал я. Не знаю, использует ли ещё кто-то в Cursor — Fix BB.

Samantha: I’ve never heard of it.

Samantha: Я никогда не слышала.

Jonas: Yeah.[00:14:00]

Jonas: Ага. [00:14:00]

Fix Bug Bot. So this is a thing that we want to integrate more tightly on. So you made it for

Fix Bug Bot. Это штука, которую мы хотим интегрировать плотнее. Так что ты сделал её для

swyx: yourself.

swyx: себя.

Jonas: I made this for myself. It’s actually available to everyone in the team, but yeah, no one knows about it. But yeah, there will be Bug Bot comments, and so Bug Bot has a lot of cool things. We actually just launched Bug Bot Auto Fix, where you can click a button or change a setting and it will automatically fix its own things, and that works great in a bunch of cases.

Jonas: Я сделал это для себя. Это доступно всем в команде, но никто о ней не знает. Но да, будут комменты Bug Bot, и у Bug Bot много крутых штук. Мы недавно запустили Bug Bot Auto Fix — можно кликнуть кнопку или поменять настройку, и он автоматически чинит свои находки, и это отлично работает в куче случаев.

There are some cases where having the context of the original agent that created the PR is really helpful for fixing the bugs, because it might be like, oh, the bug here is that this is a regression and actually you meant to do something more like that. And so having the original prompt and all of the context of the agent that worked on it helps, and so here I could just do fix, or we used to be able to do fix bb, and it would do that.

Есть случаи, когда контекст оригинального агента, создавшего PR, реально полезен для починки бага — потому что может оказаться, что баг — это регрессия, и на самом деле ты хотел сделать что-то скорее так. И вот наличие исходного промта и всего контекста агента, который работал над этим — и здесь я мог просто написать /fix… раньше мы могли писать /fixbb, и он это делал.

No test is another one that we’ve had. Slash repro is in here. We mentioned that one.

/no test — ещё одна. /repro здесь — мы её упоминали.

Samantha: One of my favorites is cloud agent diagnosis. This is one that makes heavy use of the Datadog MCP. Okay. And I [00:15:00] think Nick and David on our team wrote, and basically if there is a problem with a cloud agent we’ll spin up a bunch of subs.

Samantha: Одна из моих любимых — cloud agent diagnosis. Эта использует Datadog MCP по полной. Окей. И я думаю, [00:15:00] Nick и David из нашей команды её написали. И по сути, если есть проблема с облачным агентом — мы поднимаем кучу sub-…

Like a single

Одного

swyx: instance.

swyx: экземпляра.

Samantha: Yeah. We’ll take the ID as an argument and spin up a bunch of subagents using the Datadog MCP to explore the logs and find, like, all of the problems that could have happened with that. It takes the debugging time, like, potentially you can do quick stuff quickly with the Datadog UI, but it takes it down to, again, a single agent call as opposed to trawling through logs yourself.

Samantha: Да. Берём ID как аргумент, поднимаем кучу subagents, которые через Datadog MCP исследуют логи и находят все проблемы, что могли случиться. Это сокращает время отладки: через Datadog UI простые вещи можно сделать быстро, но здесь всё сводится к одному вызову агента, вместо того чтобы самому копаться в логах.

Jonas: You should also talk about the stuff we’ve done with transcripts.

Jonas: Тебе стоит ещё рассказать про то, что мы делали с транскриптами.

Samantha: Yes. So basically we’ve also done some things internally, and there’ll be some versions of this that we ship publicly soon, where you can spin up an agent and give it access to another agent’s transcript to either basically debug something that happened.

Samantha: Да. По сути, мы внутри сделали такие штуки — будут публичные версии скоро, где ты можешь поднять агента и дать ему доступ к транскрипту другого агента, чтобы либо отладить произошедшее.

So act as an external debugger. I see. Or continue the conversation. Almost like forking it.

Действовать как внешний дебаггер. Понятно. Или продолжить разговор — почти как fork.

swyx: A transcript includes all the chain of thought for the 11 minutes here. 45 minutes there.

swyx: Транскрипт включает всю chain of thought за 11 минут здесь, 45 минут там.

Samantha: Yeah. That way. Exactly. So basically acting as a like secondary agent that debugs the first, so we’ve started to push more and

Samantha: Да. Именно. По сути, действуя как вторичный агент, отлаживающий первого. Мы начали выкатывать это и

swyx: they’re all the same [00:16:00] code.

swyx: они все — один и тот же [00:16:00] код.

It is just the different prompts, but the sa the same.

Это просто разные промты, но сам код тот же.

Samantha: Yeah. So basically same cloud agent infrastructure and then same harness. And then like when we do things like include, there’s some extra infrastructure that goes into piping in like an external transcript if we include it as an attachment.

Samantha: Да. По сути та же инфра облачного агента и тот же harness. И когда мы делаем такие вещи как включение — есть дополнительная инфраструктура, которая пробрасывает внешний транскрипт, если мы прикладываем его как attachment. Но для таких штук, как cloud agent diagnosis, в основном используется Datadog MCP. Мы также запустили MCPs вместе с launch-ом cloud agent — поддержку cloud agent MCPs.

But for things like the cloud agent diagnosis, that’s mostly just using the Datadog MCP. ’Cause we also launched MCPs along with this cloud agent launch: support for cloud agent MCPs.

swyx: О, это потерялось в шуме.

swyx: Oh, that was drowned out.

Jonas: Мы будем делать большой маркетинговый момент по этому на следующей неделе, но да, теперь можно использовать MCPs и

Jonas: We’ll be doing a bigger marketing moment for it next week, but you can now use MCPs and

swyx: люди тоже это послушают.

swyx: People will listen to it as well.

Да,

Yeah,

Jonas: они

Jonas: they’ll

Samantha: будут на третий день впереди. Будут впереди. И я… я на самом деле не знаю, доступен ли Datadog MCP публично. Понимаю, что не уверена — я её бета-тестирую, но это одна из моих любимых.

Samantha: be ahead of the third. They’ll be ahead. And I actually don’t know if the Datadog MCP is, like, publicly available yet. I realize I’m not sure; I’m beta testing it, but it’s been one of my favorites to use. So

swyx: Я думаю, эта интересная для Datadog. Потому что Datadog хочет владеть этим сегментом.

swyx: I think that one’s interesting for Datadog. ‘cause Datadog wants to own that site.

Интересно с Bits. Не знаю, пробовала ли ты Bits.

Interesting with Bits. I don’t know if you’ve tried bits.

Samantha: Не пробовала Bits.

Samantha: I haven’t tried bits.

swyx: Да.

swyx: Yeah.

Jonas: Это их продукт cloud agent.

Jonas: That’s their cloud agent

swyx: Да. Да. Они хотят быть тем — мы владеем твоими логами и даём… частью [00:17:00] self-healing software, который все хотят. Да. Но очевидно, у Cursor сильное мнение по поводу coding agents, и ты — у тебя — забираешь это себе. Что очевидно, ты будешь делать, и не каждая компания — это Cursor, но интересно — если ты Datadog, что ты делаешь здесь?

swyx: product. Yeah. Yeah. They want to be like, we own your logs, so give us some part of the [00:17:00] self-healing software that everyone wants. Yeah. But obviously Cursor has a strong opinion on coding agents, and you’re, like, taking that away from them, which obviously you’re going to do, and not every company’s like Cursor, but it’s interesting: if you’re a Datadog, like, what do you do here?

Раскрываешь свои логи через MCP и даёшь другим работать с ними? Или пытаешься владеть этим, потому что это дополнительный бизнес? Да. Интересный вопрос.

Do you expose your logs via MCP and let other people do it? Or do you try to own it because it’s extra business for you? Yeah. It’s like an interesting one.

Samantha: Хороший вопрос. Всё, что я знаю — я обожаю Datadog MCP,

Samantha: It’s a good question. All I know is that I love the Datadog MCP,

Jonas: И да, не сюрприз, что люди будут это требовать, верно?

Jonas: And yeah, it is gonna be no, no surprise that people like will demand it, right?

Samantha: Да.

Samantha: Yeah.

swyx: Это, это как любая

swyx: It’s, it’s like any

система

system

swyx: system-of-record компания: это вопрос «сколько отдавать?». Круто. Думаю, это всё по туру по cloud agents. Круто. Давай просто поговорим о cloud agents. Когда Cursor запустил cloud agents? Знаешь? В июне

swyx: of record company like this, it’s like, how much do you give away? Cool. I think that’s that for the sort of cloud agents tour. Cool. And let’s just talk about cloud agents. When did Cursor launch cloud agents? Do you know? In June

Jonas: в прошлом году.

Jonas: last year.

swyx: В июне прошлого года. То есть это медленно разворачивалось — то, что вы делали, типа Michael сделал пост, где сам показал этот график — агенты обгоняют tab. И я такой: вау, это же крупнейший переход в коде.

swyx: June last year. So it’s been slowly developing. The thing you did, like, Michael did a post himself where he showed this chart of agents overtaking tab. And I’m like, wow, this is like the biggest transition in code.

Jonas: Да.

Jonas: Yeah.

swyx: За [00:18:00] последний

swyx: Like in, in [00:18:00] like the last,

Jonas: да. Я думаю, это получило огласку.

Jonas: yeah. I think that kind of got turned out.

Да. Я думаю, это очень интересный…

Yeah. I think it’s a very interest,

swyx: не очень. Я думаю, это подсветил наш друг Andrej Karpathy сегодня.

swyx: not at all. I think it’s been highlighted by our friend Andrej Karpathy today.

Jonas: Окей.

Jonas: Okay.

swyx: Расскажи подробнее. Что это значит? Да. Мне только что дали клавишу Cursor Tab.

swyx: Talk more about it. What does it mean? Yeah. I just got given, like, the Cursor Tab key.

Jonas: Да. Да.

Jonas: Yes. Yes.

swyx: Это

swyx: That’s that’s

Samantha: круто.

Samantha: cool.

swyx: Знаю, но её скоро поставят в музей.

swyx: I know, but it’s gonna be like put in a museum.

Jonas: Так и есть.

Jonas: It is.

Samantha: Должна сказать, я сама уже мало использую tab.

Samantha: I have to say I haven’t used tab in a little bit myself.

Jonas: Да. Я думаю, то, как выглядит написание кода с AI — создание софта в общем, даже если хочется уйти на более высокий уровень — меняется очень быстро. Не горячий тейк, но я думаю, что с нашего ракурса в Cursor одна из вещей, которая, наверное, недооценивается извне, — это то, что мы очень самосознательно относимся к этому факту. Cursor стартовал в фазе один, эре tab и автокомплита.

Jonas: Yeah. I think that what it looks like to code with AI, or more generally to create software, even if you want to go higher level, is changing very rapidly. Not a hot take, but I think from our vantage point at Cursor, one of the things that is probably underappreciated from the outside is that we are extremely self-aware about that fact. And Cursor got its start in phase one, era one, of tab and autocomplete.

И это было реально полезно в своё время. Но многие начинают смотреть на текстовые файлы и редактировать код — мы называем это hand coding теперь, когда ты реально печатаешь буквы, это

And that was really useful in its time. But a lot of people start looking at text files and editing code, like, we call it hand coding now, when you type out the actual letters, it’s

swyx: о, это мило.

swyx: oh that’s cute.

Jonas: Ага.

Jonas: Yeah.

swyx: О, это мило.

swyx: Oh that’s cute.

Jonas: Ты такой бумер. Такой бумер. [00:19:00] И это, я думаю, было медленно ускоряющимся, а в последние месяцы — стремительно ускоряющимся сдвигом.

Jonas: You’re so boomer. So boomer. [00:19:00] And so that I think has been a slowly accelerating and now in the last few months, rapidly accelerating shift.

И мы думаем, что то же самое случится со следующей вещью, где — я думаю, некоторые боли вокруг tab: он классный, но я просто хочу отдать больше агенту, и я не хочу делать по одному tab за раз. Я хочу просто дать ему задачу — и он уйдёт и сделает более крупный кусок работы, а я могу

And we think that’s going to happen again with the next thing where the, I think some of the pains around tab of it’s great, but I actually just want to give more to the agent and I don’t want to do one tab at a time. I want to just give it a task and it goes off and does a larger unit of work and I can.

откинуться чуть назад и оперировать на более высоком уровне абстракции. Это случится снова — от агентов, возвращающих тебе diff’ы, где ты в полях и даёшь им 30-секундные — 3-минутные задачи, до того, что ты даёшь им 3-минутные — 30-минутные — 3-часовые задачи и получаешь обратно видео и пробуешь превью, а не сразу смотришь на diff каждый раз.

Lean back a little bit more and operate at that higher level of abstraction. That’s going to happen again, where it goes from agents handing you back diffs, and you’re in the weeds giving it 30-second to three-minute tasks, to you giving it three-minute to 30-minute to three-hour tasks and getting back videos and trying out previews rather than immediately looking at diffs every single time.

swyx: Да. Что-нибудь добавишь?

swyx: Yeah. Anything to add?

Samantha: Один другой сдвиг, который я заметила, когда наши облачные агенты внутри по-настоящему взлетели — это сдвиг от преимущественно индивидуальной разработки к почти коллаборативной природе разработки. Для нас Slack — это почти что среда разработки [00:20:00].

Samantha: One other shift that I’ve noticed as our cloud agents have really taken off internally has been a shift from primarily individually driven development to almost this collaborative nature of development. For us, Slack is actually almost like a development [00:20:00] IDE, basically.

Так что я

So I

swyx: возможно, даже не строить кастомный UI — может, это штука для отладки, но на самом деле это так и есть.

swyx: like maybe don’t even build a custom ui, like maybe that’s like a debugging thing, but actually it’s that.

Samantha: Я чувствую, да, тут ещё много можно исследовать, но по сути для нас много разработки происходит в Slack. У нас есть issue-каналы или каналы обсуждения продукта, где люди постоянно пишут @Cursor, и это запускает облачного агента.

Samantha: I feel like, yeah, there’s still so much left to explore there, but basically for us, Slack is where a lot of development happens. Like, we will have these issue channels or just these product discussion channels where people are always @Cursor-ing, and that kicks off a cloud agent.

И для нас, по крайней мере, у нас включены team follow-ups. То есть если Jonas запустил Cursor в треде, я могу ответить и добавить контекста. И это превращается почти в discussion-сервис, где люди могут коллаборировать в UI. Часто я запускаю расследование, а потом иногда даже прошу его сделать git blame и тегнуть людей, которых стоит привлечь. Потому что он может тегать людей в Slack, и потом другие люди приходят

And for us at least, we have team follow-ups enabled. So if Jonas kicks off @Cursor in a thread, I can follow up with it and add more context. And so it turns into almost like a discussion service where people can collaborate on UI. Oftentimes I will kick off an investigation, and then sometimes I even ask it to git blame and then tag people who should be brought in, ’cause it can tag people in Slack, and then other people will come

swyx: может тегать других людей, которые не вовлечены в разговор? Да. Может просто @Jonas, скажем, если был…

swyx: in, can tag other people who are not involved in conversation. Yes. Can just do @Jonas if, say, was talking to,

Samantha: Да.

Samantha: yeah.

swyx: Круто. Вам ребят стоит сделать из этого большое событие.

swyx: That’s cool. You guys should make a big deal outta that.

Samantha: Знаю. Я чувствую, что с нашей Slack-поверхностью ещё много чего можно показать внешне. Но да, по сути, [00:21:00] он может привлекать других людей, и потом эти люди тоже могут вносить вклад в тред, и в итоге ты получаешь PR с видимыми артефактами, и люди могут сказать: «Окей, круто, мерджим».

Samantha: I know. It’s a lot to, I feel like there’s a lot more to do with our slack surface area to show people externally. But yeah, basically like it [00:21:00] can bring other people in and then other people can also contribute to that thread and you can end up with a PR again, with the artifacts visible and then people can be like, okay, cool, we can merge this.

То есть для нас IDE как бы переезжает в Slack в некотором смысле.

So for us it’s like the IDE is almost like moving into Slack in some ways as well.

swyx: У меня такой же опыт, но это не разработчики, это я.

swyx: I have the same experience with, but it’s not developers, it’s me. Designer salespeople.

Samantha: Да.

Samantha: Yeah.

swyx: Дизайнер, sales-люди.

swyx: So me on like technical marketing, vision, designer on design and then salespeople on here’s the legal source of what we agreed on.

Samantha: Да.

And then they all just collaborate and correct the agents.

swyx: Я по техническому маркетингу, видению, дизайнер по дизайну, sales-люди — вот легальный источник того, о чём договорились.

Jonas: I think what we found with these threads is, the work that is left, that the humans are discussing in these threads, is the nugget of what is actually interesting and relevant. It’s not the boring details of where does this if statement go?

И они все просто коллаборируют и поправляют агентов. Верно.

It’s do we wanna ship this? Is this the right ux? Is this the right form factor? Yeah. How do we make this more obvious to the user? It’s like those really interesting kind of higher order questions that are so easy to collaborate with and leave the implementation to the cloud agent.

Jonas: Я думаю, мы поняли, что когда есть эти треды — та работа, которая остаётся людям и которую они обсуждают в этих тредах — это и есть тот самородок интересного и релевантного. Это не скучные детали типа «где должен стоять этот if». Это «хотим ли мы это вообще релизить? Это правильный UX? Это правильный форм-фактор?»

Samantha: Totally. And no more discussion of, am I gonna do this? Are you [00:22:00] gonna do this? Cursor’s doing it. You just have to decide if you like it.

Да. «Как сделать это более очевидным для пользователя?». Вот эти реально интересные, более высокого порядка вопросы, по которым очень легко коллаборировать — и оставить имплементацию облачному агенту.

swyx: Sometimes the, I don’t know if there’s a, this probably, you guys probably figured this out already, but since I, you need like a mute button. So like cursor, like we’re going to take this offline, but still online.

Samantha: Полностью. И больше нет обсуждений: «Я буду это делать? Ты будешь это делать? [00:22:00] Cursor это делает». Тебе просто нужно решить, нравится ли тебе.

But like we need to talk among the humans first. Before you like could stop responding to everything.

swyx: Иногда, не знаю, есть ли — вы, ребят, наверное, уже сообразили — но поскольку я… нужна кнопка mute. Типа «cursor, мы возьмём это оффлайн, но всё ещё онлайн, нам надо поговорить среди людей сначала, прежде чем ты можешь… ты можешь перестать отвечать на всё».

Jonas: Yeah. This is a design decision where currently Cursor won’t chime in unless you explicitly @-mention it. Yeah. Yeah.

Jonas: Ага. Это дизайн-решение — сейчас Cursor не будет встревать, пока ты явно не упомянешь его. Да. Да.

Samantha: So it’s not always listening.

Samantha: То есть он не всегда слушает. Да.

Yeah.

Jonas: It can see all the intermediate messages.

Jonas: Он может видеть все промежуточные сообщения.
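The mention-only behavior described in this exchange can be sketched roughly as follows. This is a minimal Python illustration, not Cursor's Slack integration: the `@cursor` handle, `should_respond`, and `handle` are all hypothetical names, and passing the whole thread to the agent once it is mentioned is an assumption made for the example.

```python
import re
from typing import List, Optional

BOT_HANDLE = "@cursor"  # hypothetical handle, chosen for illustration

# Match the handle only as a standalone mention, not inside emails or longer words.
_MENTION = re.compile(rf"(?<!\w){re.escape(BOT_HANDLE)}\b", re.IGNORECASE)


def should_respond(message: str) -> bool:
    """The bot stays silent unless the message explicitly mentions it."""
    return _MENTION.search(message) is not None


def handle(thread: List[str]) -> Optional[List[str]]:
    """Return the context to hand to the agent, or None to stay quiet.

    Assumption: once mentioned, the agent receives the full thread,
    including earlier messages that did not mention it.
    """
    if not thread or not should_respond(thread[-1]):
        return None
    return list(thread)


if __name__ == "__main__":
    thread = ["we need to talk among the humans first", "ok, @Cursor please fix it"]
    print(handle(thread[:1]))
    print(handle(thread))
```

Gating on an explicit mention is what lets people keep talking in the thread without a separate mute control.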

swyx: Have you done the recursive, can cursor add another cursor or spawn another cursor?

swyx: Вы делали рекурсивное — Cursor может добавить другого Cursor или заспаунить другого Cursor?

Samantha: Oh,

Samantha: О,

Jonas: we’ve done some versions of this.

Jonas: у нас были версии этого.

swyx: Because, ‘cause it can add humans.

swyx: Потому что — потому что он может добавлять людей.

Jonas: Yes. One of the other things we’ve been working on that’s like an implication of generating the code is so easy is getting it to production is still harder than it should be.

Jonas: Да. Одна из других вещей, над которыми мы работали, — следствие того, что генерировать код стало так легко, — это то, что довезти его до прода всё ещё сложнее, чем должно быть.

And broadly, you solve one bottleneck and three new ones pop up. Yeah. And so one of the new bottlenecks is getting into production, and we have, like, a joke internally where you’ll be talking about some feature and someone says, “I have a PR for that.” Which is, it’s so easy [00:23:00] to get to “I have a PR for that,” but it’s still relatively hard to get from “I have a PR for that” to “I’m confident and ready to merge this.”

И в целом — ты решаешь одно бутылочное горлышко, а появляются три новых. Да. Одно из новых — это попадание в прод, и у нас есть внутренняя шутка: ты обсуждаешь какую-то фичу, и кто-то говорит: «У меня уже PR на это». Это так легко — добраться до «у меня PR на это», [00:23:00] но всё ещё относительно сложно дойти от «у меня PR на это» до «я уверен и готов это мерджить».

And so I think that over the coming weeks and months, that’s a thing that we think a lot about: how do we scale up compute to that pipeline of getting things from a first draft an agent did.

И я думаю, что в ближайшие недели и месяцы это то, о чём мы много думаем — как масштабировать compute в этом пайплайне доставки от первого черновика, который сделал агент.

swyx: Isn’t that what merge, isn’t that what Graphite’s for, like

swyx: Это разве не то, для чего merge, разве не для этого Graphite, типа

Jonas: Graphite is a big part of that. The cloud agent testing

Jonas: Graphite играет в этом большую роль. Cloud agent testing

swyx: Is it fully integrated or still different companies

swyx: это полностью интегрировано, или это всё ещё разные компании

Jonas: working on I think we’ll have more to share there in the future, but the goal is to have great end-to-end experience where Cursor doesn’t just help you generate code tokens, it helps you create software end-to-end.

Jonas: работают над — мы расскажем больше в будущем, но цель — иметь отличный end-to-end опыт, где Cursor не просто помогает тебе генерировать токены кода, он помогает тебе создавать софт end-to-end.

And so review is a big part of that, that I think especially as models have gotten much better at writing code, generating code, we’ve felt that relatively crop up more,

И ревью — большая часть этого, и в особенности по мере того, как модели стали гораздо лучше писать код, генерировать код, мы почувствовали, что это относительно вылезает чаще.

swyx: Sorry, this is completely unplanned, but I have people arguing, one, you need AI to review AI, and then there is another school of thought where it’s: no, [00:24:00] reviews are dead.

swyx: Извини, это совершенно незапланированное — есть люди, которые спорят: одна сторона — нужен AI для ревью AI, и есть другой подход, школа мысли, где: нет, [00:24:00] ревью мертво.

Like just show me the video. It’s it like,

Типа просто покажи мне видео.

Samantha: yeah. I feel again, for me, the video is often like alignment and then I often still wanna go through a code review process.

Samantha: Да. Я снова чувствую, для меня видео — это часто согласование, и я всё равно часто хочу пройти через процесс ревью кода.

swyx: Like still look at the files and

swyx: Типа всё равно смотреть на файлы и

Samantha: everything. Yeah. There’s a spectrum, of course. Like, the video, if it’s really well done and it does fully test everything, you can feel pretty confident, but it’s still helpful to look at the code.

Samantha: всё. Да. Тут есть спектр, конечно. Если видео реально хорошо сделано и реально полностью всё тестирует, можно чувствовать себя достаточно уверенно, но всё равно полезно посмотреть код.

I pay a lot of attention to Bug Bot. I feel like Bug Bot has been really highly adopted internally. We often tell people, don’t leave Bug Bot comments unaddressed, ’cause we have such high confidence in it. So people always address their Bug Bot comments.

Я много внимания уделяю Bug Bot. Bug Bot стал реально широко адоптирован внутри. Мы часто говорим людям: не оставляйте комменты Bug Bot без ответа. Потому что у нас высокая уверенность в нём. Поэтому люди всегда отвечают на комменты Bug Bot.

Jonas: Once you’ve had two cases where you merged something and then you went back later, there was a bug in it, you merged, you went back later and you were like, ah, bug Bot had found that I should have listened to Bug Bot.

Jonas: Стоит два раза в жизни замерджить что-то, потом вернуться позже — а там баг, который ты замерджил, и ты возвращаешься позже и такой: «Эх, Bug Bot же нашёл это, надо было послушать Bug Bot».

Once that happens two or three times, you learn to wait for bug bot.

Стоит этому случиться два-три раза — ты учишься ждать Bug Bot.

Samantha: Yeah. So I think for us there’s like that code level review where like it’s looking at the actual code and then there’s like the like feature level review where you’re looking at the features. There’s like a whole number of different like areas.

Samantha: Да. Я думаю, для нас есть это code-level review, где смотрят на сам код, и feature-level review, где смотрят на фичи. Есть целый ряд разных областей.

There’ll probably eventually be things like performance level review, security [00:25:00] review, things like that where it’s like more more different aspects of how this feature might affect your code base that you want to potentially leverage an agent to help with.

Со временем будут такие вещи, как performance-level review, security [00:25:00] review — где это более разные аспекты того, как фича может повлиять на ваш кодбейс, для которых вы потенциально захотите использовать агента.

Jonas: And some of those like bug bot will be synchronous and you’ll typically want to wait on before you merge.

Jonas: И некоторые из них, как Bug Bot, будут синхронными, и ты обычно захочешь его дождаться перед мерджем.

But I think another thing that we’re starting to see is, as with cloud agents you scale up this parallelism and how much code you generate, 10-person startups need the DevEx and pipelines that a 10,000-person company used to need. And that looks like a lot of the things that 10,000-person companies invented in order to get that volume of software to production safely.

Но я думаю, ещё одна штука, которую мы начинаем видеть, — с облачными агентами ты масштабируешь параллелизм и объём генерируемого кода. Стартапы из 10 человек начинают нуждаться в DevEx и пайплайнах, которые раньше нужны были компании из 10 000 человек. И это выглядит как многие вещи, которые компании из 10 000 человек изобрели, чтобы безопасно довозить такой объём софта в прод.

So that’s things like release frequently, or release slowly, have different stages where you release, have checkpoints, automated ways of detecting regressions. And so I think we’re gonna need stacked diffs, merge queues. Exactly. A lot of those things are going to be important

То есть вещи как «релизься часто», или «релизься медленно», «имей разные стадии релиза», «имей чекпоинты», «автоматизированные способы детекции регрессий». Я думаю, нам понадобятся stacked diffs, merge queues. Именно. Многие из этих вещей будут важны

swyx: forward with.

swyx: вперёд.

I think the majority of people still don’t know what stacked diffs are. And I, like, I have many friends in Facebook and I’m pretty friendly with Graphite. I’ve just, [00:26:00] I’ve never needed it ’cause I don’t work on that large a team, and it’s just, like, democratization of: no, here’s what we’ve already worked out at very large scale, and here’s how it benefits you too.

Я думаю, большинство людей до сих пор не знают, что такое stacks. У меня много друзей в Facebook, и я… я довольно дружу с Graphite. Я просто… [00:26:00] никогда мне это не было нужно, потому что я не работаю в такой большой команде, и это просто демократизация — нет, вот то, что мы уже решили на очень большом масштабе, и вот как это полезно тебе тоже.

Like I think to me, one of the beautiful things about GitHub is that. It’s actually useful to me as an individual solo developer, even though it’s like actually collaboration software.

Для меня одна из прекрасных вещей в GitHub — то, что он реально полезен мне как индивидуальному соло-разработчику, хотя это collaboration software.

Jonas: Yep.

Jonas: Угу.

swyx: And I don’t think a lot of DevEx tools have figured that out yet. That transition from, like, large down to small.

swyx: И я не думаю, что много DevEx-инструментов это поняли. Этот переход с большого на маленькое.

Jonas: Yeah. Cursor is probably an inverse story.

Jonas: Да. Cursor — это, наверное, обратная история.

swyx: This is small down to

swyx: Это малое — на

Jonas: Yeah. Where historically with Cursor, part of why we grew so quickly was anyone on the team could pick it up, and in fact people would pick it up on the weekend for their side project and then bring it into work, ’cause they loved using it so much.

Jonas: Да. Где исторически Cursor, часть того, почему мы росли так быстро, — это что кто угодно в команде мог его подхватить, и фактически люди подхватывали его на выходных для side-проектов, и потом приносили на работу. Потому что им так нравилось его использовать.

swyx: Yeah.

swyx: Да.

Jonas: And I think a thing that we’ve started working on a lot more, not us specifically, but as a company and other folks at Cursor, is making it really great for teams, and making it so the 10th person that starts using Cursor in a team is immediately set up with things. Like, we launched Marketplace recently so other people can [00:27:00] configure MCPs and skills, like plugins.

Jonas: И я думаю, вещь, над которой мы начали работать больше — не мы конкретно, но как компания и другие в Cursor — это делать его реально классным для команд и делать так, чтобы 10-й человек, начавший использовать Cursor в команде, сразу был настроен. Мы недавно запустили Marketplace, чтобы другие могли [00:27:00] конфигурировать MCPs и skills (плагины).

So skills and MCPs, other people can configure that, so that my Cursor is ready to go and set up. Sam loves the Datadog MCP and the Slack MCP, which you’ve also been using a lot, but

То есть skills и MCPs другие люди могут сконфигурировать, чтобы мой Cursor был готов и настроен. Sam обожает Datadog MCP и Slack MCP — ты тоже много её использовала, но

Samantha: also pre-launch, but I feel like it’s so good.

Samantha: тоже пре-релиз, но я чувствую, что она настолько хороша.

Jonas: Yeah, my Cursor should be configured with it if Sam feels strongly that it’s just amazing and required.

Jonas: Да, мой Cursor должен быть сконфигурирован — если Sam так считает, это просто потрясающе и необходимо.

swyx: Is it automatically shared or you have to go and.

swyx: Это автоматически шерится, или нужно зайти и…

Jonas: It depends on the MCP. So some are obviously auth per user. Yeah. And so Sam can’t auth my Cursor with my Slack MCP, but some are team auth and those can be set up by admins.

Jonas: Зависит от MCP. Некоторые, очевидно, на пользователя. Да. Sam не может включить Slack MCP моего Cursor, но некоторые — team-level, и они могут быть настроены админами.
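
A minimal sketch of the per-user versus team auth split Jonas describes. Everything here is hypothetical and for illustration only: the `McpServer` and `Team` classes, the `"user"` / `"team"` scope labels, and the token strings are assumptions, not Cursor's actual data model. The transcript only confirms that the Slack MCP is authorized per user; treating the Datadog MCP as team-authorized is an assumption made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass
class McpServer:
    name: str
    auth_scope: str  # hypothetical labels: "user" or "team"


@dataclass
class Team:
    servers: Dict[str, McpServer]
    # Credentials an admin sets once for the whole team, keyed by server name.
    team_credentials: Dict[str, str] = field(default_factory=dict)
    # Credentials each person sets for themselves, keyed by (user, server name).
    user_credentials: Dict[Tuple[str, str], str] = field(default_factory=dict)

    def credential_for(self, user: str, server_name: str) -> Optional[str]:
        """Return the credential this user's agent may use, or None."""
        server = self.servers[server_name]
        if server.auth_scope == "team":
            return self.team_credentials.get(server_name)
        # Per-user servers never fall back to a teammate's credential.
        return self.user_credentials.get((user, server_name))


team = Team(servers={
    "slack": McpServer("slack", "user"),
    "datadog": McpServer("datadog", "team"),  # assumed scope, see lead-in
})
team.team_credentials["datadog"] = "team-token"
team.user_credentials[("sam", "slack")] = "sam-token"

print(team.credential_for("sam", "slack"))
print(team.credential_for("jonas", "slack"))
print(team.credential_for("jonas", "datadog"))
```

The point of the lookup is that a teammate can make a server available to everyone, but only an admin-set team credential is shared; a per-user credential stays with the person who created it.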

swyx: Yeah. Yeah. That’s cool. Yeah, I think, we had Aman on the pod when Cursor was five people, and like everyone was like, okay, what’s the thing?

swyx: Да. Да. Это круто. Да, у нас в подкасте был Aman, когда в Cursor было пять человек, и все были такие: «Окей, в чём фишка?»

And then it’s usually something like teams and org and enterprise, but it’s actually working. But usually at that stage, when you’re five, when you’re just a VS Code fork, it’s like, how do you get there? Yeah. Will people pay for this? People do pay for it.

И обычно это что-то — teams, орг и enterprise, но это реально работает. Но обычно на этой стадии, когда вас пятеро, когда вы просто форк VS Code — это «как вы туда дойдёте?». Да. «Будут ли люди за это платить?». Люди платят.

Jonas: Yeah. And I think for cloud agents, we expect [00:28:00]

Jonas: Да. И я думаю, для облачных агентов мы ожидаем [00:28:00]

to have similar kind of PLG things, where I think off the bat we’ve seen a lot of adoption with kind of smaller teams where the code bases are not quite as complex to set up. Yes. If you need some insane Docker layer caching thing for builds not to take two hours, that’s going to take a little bit longer for us to be able to support that kind of infrastructure.

что будут похожие PLG-штуки, где, я думаю, с самого начала мы видели много адопции с меньшими командами, где кодбейзы не такие сложные в настройке. Да. Если тебе нужен какой-то безумный docker layer caching, чтобы билды не занимали два часа — нам понадобится чуть больше времени, чтобы такую инфраструктуру поддержать.

Whereas if you have a front end and a backend, like, one click, agents can install everything that they need themselves.

Тогда как если у тебя front-end backend — one-click агенты могут установить всё, что им нужно.

swyx: This is a good chance for me to just ask some technical, sort of check-the-box questions. Can I choose the size of the VM?

swyx: Это хороший повод задать тебе пару технических вопросов чек-боксного типа. Могу ли я выбрать размер VM?

Jonas: Not yet. We are planning on adding that. We

Jonas: Пока нет. Планируем добавить.

swyx: This is obviously, you want like L, XXL, whatever, right?

swyx: Очевидно, хочется L, XXL, что угодно — типа как Amazon с этим меню.

Like, it’s like the Amazon sort of menu.

Jonas: Yes, exactly. We’ll add that.

Jonas: Да, именно. Добавим.

swyx: Yeah. In some ways you have to basically become like an EC2, almost like you rent a box.

swyx: Да. В каком-то смысле вам надо стать чем-то вроде EC2 — почти как «арендуешь ящик».

Jonas: You rent a box. Yes. We talk a lot about brain in a box. Yeah. So Cursor, we want to be a brain in a box,

Jonas: Арендуешь ящик. Да. Мы много говорим про «мозг в коробке». Да. И Cursor мы хотим, чтобы был «мозгом в коробке»,

swyx: but is the mental model different? Is it more serverless?

swyx: но ментальная модель другая? Больше serverless?

Is it more persistent? Is it something else?

Больше persistent? Что-то ещё?

Samantha: We want it to be a bit persistent. The desktop should be [00:29:00] something you can return to even after some days. Like maybe you go back, they’re like still thinking about a feature for some period of time. So the

Samantha: Мы хотим, чтобы было немного persistent. Десктоп должен быть [00:29:00] чем-то, к чему можно вернуться даже через несколько дней. Возможно, ты возвращаешься, они всё ещё думают над фичей какое-то время.

swyx: full, like, suspend the memory and bring it back and then keep going.

swyx: То есть полная приостановка памяти и возвращение, а потом продолжение.

Samantha: Exactly.

Samantha: Именно.

swyx: That’s an interesting one, because what I actually do want, like from a Manus and OpenClaw, whatever, is like I want to be able to log in with my credentials to the thing, but not actually store it in any like secret store, whatever. ’Cause it’s like, this is my most sensitive stuff.

swyx: Это интересно, потому что то, что я хочу — от Manus, Open Crawl, whatever — я хочу иметь возможность залогиниться со своими credentials в эту штуку, но не хранить их в каком-то секрет-сторе или ещё где. Потому что это, типа, моё самое чувствительное.

Yeah. This is like my email, whatever. And just have it, like, persist to the image. I don’t know how it works under the hood, but like to rehydrate and then just keep going from there. But I don’t think a lot of infra works that way. A lot of it’s stateless, where like you save it to a Docker image and then it’s only whatever you can describe in a Dockerfile and that’s it.

Да. Это типа моя почта, что угодно. И просто пусть это persist в образе. Не знаю, как это под капотом — типа re-hydrate и продолжать. Но я не думаю, что много infra работает таким образом. Много её stateless, где сохраняешь в Docker-образ, и всё — только то, что можешь описать в Dockerfile.

That’s the only thing you can clone multiple times in parallel.

Это единственная штука, которую можно склонировать много раз параллельно.

Jonas: Yeah. We have a bunch of different ways of setting them up. So there’s a Dockerfile-based approach. The main default way is actually snapshotting

Jonas: Да. У нас есть куча разных способов их настройки. Есть Dockerfile-based подход. Главный дефолтный способ — это, на самом деле, снапшоттинг

swyx: like a Linux VM

swyx: типа Linux VM

Jonas: like a VM, right. You run a bunch of install commands and then you snapshot more or less the file system.

Jonas: как VM, да. Запускаешь кучу install-команд, и потом снапшотишь по сути файловую систему.

And so that gets you set up for everything [00:30:00] that you would want to bring a new VM up from that template basically.

И это настраивает тебя на всё [00:30:00], что нужно, чтобы поднять новую VM из этого шаблона.

swyx: Yeah.

swyx: Да.

Jonas: And that’s a bit distinct from what Sam was talking about with the hibernating and rehydrating, where that is a full memory snapshot as well. So there, if I had like the browser open to a specific page and we bring that back, that page will still be there.

Jonas: И это немного отличается от того, о чём говорила Sam — про hibernating и re-hydrating, где это полный снапшот памяти. Там, если у меня был открыт браузер на конкретной странице, и мы возвращаем это назад — эта страница всё ещё там.
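
A toy model of the two mechanisms Jonas contrasts here, offered as a sketch rather than a description of Cursor's infrastructure. The `VM` class and the four function names are hypothetical. It assumes a VM can be reduced to two dictionaries: on-disk files and in-memory state such as an open browser page.

```python
import copy
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class VM:
    files: Dict[str, str] = field(default_factory=dict)   # on-disk state
    memory: Dict[str, str] = field(default_factory=dict)  # running state, e.g. open pages


def filesystem_snapshot(vm: VM) -> Dict[str, str]:
    """Template snapshot: capture only what the install commands wrote to disk."""
    return copy.deepcopy(vm.files)


def boot_from_template(template: Dict[str, str]) -> VM:
    """A new VM from the template has the files but starts with empty memory."""
    return VM(files=copy.deepcopy(template))


def hibernate(vm: VM) -> VM:
    """Full snapshot: disk and memory are both captured."""
    return copy.deepcopy(vm)


def rehydrate(image: VM) -> VM:
    """Resume exactly where the hibernated VM left off."""
    return copy.deepcopy(image)


vm = VM(
    files={"/app/node_modules": "installed"},
    memory={"browser_url": "http://localhost:4000/settings"},
)

fresh = boot_from_template(filesystem_snapshot(vm))
resumed = rehydrate(hibernate(vm))

print(fresh.memory)    # empty: the template carries no running state
print(resumed.memory)  # the open page survives the round trip
```

Because a template holds no running state, it can be cloned many times in parallel, which is the property swyx asks about; a hibernated image instead preserves one specific session.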

swyx: Was there any discussion internally, in just building this stuff, about how every time you shoot a video you actually show a little bit of the desktop and the browser, and it’s not necessary if you just show the browser? If you know you’re just demoing a front-end application.

swyx: Было ли внутри обсуждение того, что каждый раз, когда ты снимаешь видео, ты показываешь немного десктопа и браузер, и это не нужно, если ты показываешь только браузер. Если ты, ну, ты знаешь, демонстрируешь front-end приложение.

Why not just show the browser, right? Like it Yeah,

Почему не просто показать браузер? Типа,

Samantha: we do have some panning and zooming. Yeah. Like it can decide that when it’s actually recording and cutting the video to highlight different things. I think we’ve played around with different ways of segmenting it and yeah. There’s been some different revs on it for sure.

Samantha: у нас есть panning и зум. Да. Он может решать, когда записывает и режет видео, чтобы подсветить разные вещи. Я думаю, мы пробовали разные способы сегментации, и да, там было несколько ревизий.

Jonas: Yeah. I think one of the interesting things is the version that you see now in cursor.com actually is like half of what we had at peak, where we decided to unship quite a few things. So two of the interesting things to talk about, one is directly an answer to your [00:31:00] question, where we had a native browser that you would have locally. It was basically an iframe that, via port forwarding, could load the URL and talk to localhost in the VM.

Jonas: Да. Я думаю, одна из интересных штук — версия, которую ты сейчас видишь на cursor.com — это половина того, что у нас было на пике, мы решили unship довольно много штук. Две интересные. Одна напрямую отвечает на твой [00:31:00] вопрос: у нас был нативный браузер, который ты имел локально — это был, по сути, iframe, который через port-forwarding мог грузить URL, мог общаться с localhost в VM.

So that gets you basically, so in

То есть это даёт тебе… ну, в

swyx: your machine’s browser,

swyx: браузере твоего компьютера

like

типа

Jonas: in your local browser? Yeah. You would go to localhost:4000 and that would get forwarded to localhost:4000 in the VM via port forward. We unshipped that, like

Jonas: в твоём локальном браузере. Да. Идёшь на localhost:4000, и это перенаправляется на localhost:4000 в VM через port-forward. Мы это un-ship’нули

swyx: ngrok.

swyx: как ngrok.

Jonas: Like an ngrok. Exactly. We unshipped that because we felt that the remote desktop was sufficiently low latency and more general purpose.

Jonas: Как ngrok. Именно. Мы это un-ship’нули, потому что почувствовали, что remote desktop достаточно low-latency и более general-purpose.

So we build Cursor web, but we also build Cursor desktop. And so it’s really useful to be able to have the full spectrum of things. And even for Cursor Web, as you saw in one of the examples, the agent was uploading files and like I couldn’t upload files and open the file viewer if I only had access to the browser.

Поэтому мы строим Cursor Web, но также строим Cursor Desktop. И реально полезно иметь весь спектр. И даже для Cursor Web, как вы видели в одном из примеров, агент загружал файлы — а я не мог бы загрузить файлы и открыть file viewer, если бы у меня был доступ только к браузеру.

And we’ve thought a lot about, this might seem funny coming from Cursor, where we started as this VS Code fork and I think inherited a lot of amazing things, but also a lot [00:32:00] of legacy UI from VS Code.

И мы много думали — это может показаться смешным от Cursor, который начинал как VS Code Fork и унаследовал кучу классных вещей, но также кучу [00:32:00] legacy UI от VS Code.

Minimal Web UI Surfaces

Минимальные поверхности веб-UI

Jonas: And so with the web UI we wanted to be very intentional about keeping that very minimal and exposing the right set of primitive, sort of, app surfaces, we call them, that are shared features of that cloud

Jonas: И вот с web UI мы хотели быть очень целенаправленными — держать его минимальным и выставлять правильный набор примитивных app-поверхностей, как мы их называем, которые являются общими фичами того облачного

environment that you and the agent both use. So the agent uses the desktop and controls it. I can use the desktop and control it. The agent runs terminal commands. I can run terminal commands. So that’s our philosophy around it. The other thing that is maybe interesting to talk about that we unshipped is, and both of these things we may reship, and decide at some point in the future that we’ve changed our minds on the trade-offs or gotten it to a point where, put

окружения, которое использует и ты, и агент. Агент использует desktop и контролирует его. Я могу использовать desktop и контролировать. Агент запускает terminal-команды. Я могу запускать terminal-команды. Вот наша философия. Другая штука, о которой, может быть, интересно поговорить — что мы un-ship’нули, и обе эти штуки мы можем вернуть и решить в какой-то момент в будущем, что мы изменили мнение про trade-offs или довели до такого уровня… выкатить

swyx: it out there.

swyx: это.

Let users tell you they want it. Exactly. Alright, fine.

Пусть пользователи скажут, что им это нужно. Именно. Окей, хорошо.

Why No File Editor

Почему нет файлового редактора

Jonas: So one of the other things is actually a files app. And so we used to have the ability, at one point during the process of testing this internally, to see next to, I had Git, desktop, and terminal on the right-hand side of the tab there earlier, to also have a files app where you could see and edit files.

Jonas: Так вот, одна из других штук — это files-приложение. У нас была в какой-то момент во время внутреннего тестирования возможность видеть рядом — у меня были desktop и terminal справа в табах ранее — также files-app, где можно было видеть и редактировать файлы.

And we actually felt that in some [00:33:00] ways, by restricting and limiting what you could do there, people would naturally leave more to the agent and fall into this new pattern of delegating, which we thought was really valuable. And there’s currently no way in Cursor web to edit these files.

И мы реально почувствовали, что в каком-то смысле, [00:33:00] ограничивая то, что можно сделать там, люди будут естественно больше оставлять агенту и попадать в этот новый паттерн делегирования, что мы посчитали очень ценным. И сейчас в Cursor Web нет способа редактировать эти файлы.

swyx: Yeah. Except you like open up the PR and go into GitHub and do the thing.

swyx: Да. Кроме как открыть PR, зайти в GitHub и сделать там.

Jonas: Yeah.

Jonas: Да.

swyx: Which is annoying.

swyx: Что раздражает.

Jonas: Just tell the agent,

Jonas: Просто скажи агенту,

swyx: I have criticized OpenAI for this, because OpenAI’s Codex app doesn’t have a file editor. Like, it has a file viewer, but it isn’t a file editor.

swyx: Я критиковал OpenAI за это. Потому что приложение Codex от OpenAI не имеет file editor — есть file viewer, но нет file editor.

Jonas: Do you use the file viewer a lot?

Jonas: А ты много используешь file viewer?

swyx: No. I understand, but like sometimes I want it. The one way to do it is like freaking going to, no, they have an open in Cursor button or open in Antigravity or open in whatever, and people pointed that out.

swyx: Нет. Я понимаю, но иногда я хочу. Единственный способ — это чёрт возьми идти в… нет, у них есть кнопка «open in Cursor», или «open in Antigravity», или «open in whatever», и люди указывали на это.

So I was part of the early testers group. People pointed that out and they were like, this is like a design smell. It’s like you actually want a VS Code fork that has all these things, but also a file editor. And they were like, no, just trust us.

Я был частью группы ранних тестеров. Люди указывали на это, и они говорили: это design smell. Тебе на самом деле нужен VS Code Fork, у которого есть все эти штуки, но ещё и file editor. А они: «Нет, просто доверьтесь нам».

Jonas: Yeah. I think we as Cursor will want to, as a product, offer the [00:34:00] whole spectrum, and so you want to be able to

Jonas: Да. Я думаю, мы как Cursor хотим как продукт предлагать [00:34:00] весь спектр, и поэтому ты хочешь иметь возможность

work at really high levels of abstraction and double-click and see the lowest level. That’s important. But I also think that, like, you won’t be doing that in Slack. And so there are surfaces and ways of interacting where in some cases limiting the UX capabilities makes for a cleaner experience that’s simpler and drives people into these new patterns. Even locally, we kicked off joking about this:

работать на реально высоких уровнях абстракции и кликнуть и увидеть самый низкий уровень. Это важно. Но я также думаю, что ты не будешь это делать в Slack. И поэтому есть поверхности и способы взаимодействия, где в некоторых случаях ограничение UX-возможностей создаёт более чистый опыт, более простой, и подталкивает людей в эти новые паттерны, где даже локально — мы шутили об этом — люди реально не редактируют файлы, не пишут код руками.

People don’t really edit files or hand-code anymore. And so we want to build for where that’s going and not where it’s been

И мы хотим строить туда, куда это идёт, а не туда, где это было

swyx: a lot of cool stuff. And Okay. I have a couple more.

swyx: много крутого. И окей. У меня ещё пара.

Full Stack Hosting Debate

Спор о full-stack хостинге

swyx: So, observations about the design elements of these things. One of the things that I’m always thinking about is Cursor and other peers of Cursor start from like the DevEx tools and work their way towards cloud agents.

swyx: Так что — наблюдения по дизайну этих штук. Одна из вещей, о которых я всегда думаю — Cursor и другие peers Cursor начинают с DevEx-инструментов и движутся к облачным агентам.

Other people, like the Lovables and Bolts of the world, start with, here’s like the vibe code, full cloud thing. They were already cloud agents before anyone else was cloud agents, and we will give you the full deploy platform. So we own the whole loop. We own all the infrastructure, we have the logs, we have the live site, [00:35:00] whatever.

Другие люди, типа Lovable и Bolt этого мира, начинают с «вот vibe code — full cloud thing». Они уже были cloud agents до того, как кто-то другой стал cloud agents, и мы дадим тебе полную deploy-платформу. То есть мы владеем всем циклом. Мы владеем всей инфрой, у нас есть логи, у нас есть живой сайт, [00:35:00] что угодно.

And you can do that cycle. Cursor doesn’t own that cycle even today. You don’t have the Vercel, you don’t have, whatever deploy infrastructure that you’re gonna have, which gives you powers because anyone can use it. And any enterprise, whatever your infra, I don’t care. But then it also gives you limitations as to how much you can actually fully debug end to end.

И ты можешь крутить этот цикл. Cursor не владеет этим циклом даже сегодня. У тебя нет Vercel, у тебя нет — нет своей deploy-инфры. Это даёт тебе силы, потому что любой может это использовать, и любой enterprise — какую бы у тебя ни была infra, мне всё равно. Но также даёт ограничения — насколько вы можете реально полностью отладить end-to-end.

I guess I’m just putting out there that, like, is there a future where there’s like full-stack Cursor, where like cursorapps.com, where I host my Cursor site, which is basically a Vercel clone, right? I don’t know.

Я просто это выкладываю — есть ли будущее, где будет full-stack Cursor, типа cursorapps.com, где я хостю свой Cursor-сайт, что по сути клон Vercel? Не знаю.

Jonas: I think that’s an interesting question to be asking, and I think the logic that you laid out for how you would get there is logic that I largely agree with.

Jonas: Я думаю, это интересный вопрос для постановки, и логика, которую ты выложил, как ты бы туда пришёл — это логика, с которой я в большой степени согласен.

swyx: Yeah. Yeah.

swyx: Да. Да.

Jonas: I think right now we’re really focused on what we see as the next big bottleneck, and because things like the Datadog MCP exist, yeah, I don’t think that the best way we can help our customers ship more software is by building a hosting solution right now,

Jonas: Я думаю, прямо сейчас мы реально сфокусированы на том, что мы видим как следующее большое бутылочное горлышко, и поскольку существуют такие вещи, как Datadog MCP, я не думаю, что лучший способ помочь нашим клиентам выкатывать больше софта — это построить hosting solution прямо сейчас.

swyx: by the way, these are things I’ve actually discussed with some of the companies I just named.

swyx: Кстати, это вещи, которые я обсуждал с некоторыми из компаний, которые я только что назвал.

Jonas: Yeah, for sure. Right now, this big bottleneck is just getting the code out there, and also [00:36:00] unlike a Lovable or a Bolt, we focus much more on existing software. And the zero-to-one greenfield is just a very different problem. Imagine going to a Shopify and convincing them to deploy on your deployment solution.

Jonas: Да, конечно. Прямо сейчас большое бутылочное горлышко — это вывести код туда, и [00:36:00] в отличие от Lovable и Bolt, мы фокусируемся гораздо больше на существующем софте. И zero-to-one greenfield — это совсем другая проблема. Представь, как пойдёшь к Shopify и будешь убеждать их деплоить на твоём deployment solution.

That’s very different and I think will take much longer to see how that works. May never happen relative to, oh, it’s like a zero to one app.

Это очень другое и, я думаю, гораздо дольше будет видно, как это работает. Может никогда не случиться — относительно «о, это zero-to-one app».

swyx: I’ll say, it’s tempting, because look, like 50% of your apps are Vercel, Supabase, Tailwind, React. It’s the stack. It’s what everyone does.

swyx: Скажу — это соблазнительно, потому что слушай, типа 50% твоих приложений — это Vercel, Supabase, Tailwind, React. Это стек. Это то, что все делают.

So it’s kinda interesting.

Так что — это типа интересно.

Jonas: Yeah.

Jonas: Да.

Model Choice and Auto Routing

Выбор модели и авто-роутинг

swyx: The other thing is the model selector dying. Right now in cloud agents, it’s stuck down, bottom left. Sure, it’s Codex High today, but do I care if it’s suddenly switched to Opus? Probably not.

swyx: Другая штука — выбор модели. Прямо сейчас в облачных агентах он засунут вниз слева. Конечно, сегодня это Codex High, но мне важно, если он внезапно переключится на Opus? Наверное, нет.

Samantha: We definitely wanna give people a choice across models, because I feel like the meta changes very frequently.

Samantha: Мы точно хотим давать людям выбор по моделям, потому что мета меняется очень часто.

I was a big Opus 4.5 maximalist, and when Codex 5.3 came out, I hard, hard switched. So that’s all I use now.

Я была большой Opus 4.5 максималисткой, а когда вышел Codex 5.3, я резко переключилась. Так что теперь использую только его.

swyx: Yeah. Agreed. I don’t know if, but basically like when I use it in Slack, [00:37:00] right? Cursor does a very good job of exposing yeah. Cursors. If people go use it, here’s the model we’re using.

swyx: Да. Согласен. Не знаю, но в основном — когда я использую его в Slack, [00:37:00] Cursor очень хорошо это показывает. Да, Cursor. Если люди пойдут использовать, вот модель, которую мы используем.

Yeah. Here’s how you switch if you want. But otherwise it’s like abstracted away, which is like beautiful because then you actually, you should decide.

Да. Вот как переключиться, если хочешь. Но иначе это типа абстрагировано — что красиво, потому что тогда реально, ты должен сам решать.

Jonas: Yeah, I think we want to be doing more with defaults.

Jonas: Да, я думаю, мы хотим делать больше с дефолтами.

swyx: Yeah.

swyx: Да.

Jonas: Where we can suggest things to people. A thing that we have in the editor, the desktop app, is Auto, which will route your request and do things there.

Jonas: Где мы можем предлагать людям. У нас есть в редакторе, в desktop-приложении, Auto, который роутит твой запрос и делает там разные вещи.

So I think we will want to do something like that for cloud agents as well. We haven’t done it yet. And so I think we have both people like Sam, who are very savvy and want to know exactly what model they want, and we also have people that want us to pick the best model for them, because we have amazing people like Sam and we are the experts.

Думаю, мы захотим сделать что-то похожее и для облачных агентов. Пока не сделали. И поэтому у нас есть люди как Sam, которые очень подкованные и хотят точно знать, какую модель хотят, и также есть люди, которые хотят, чтобы мы выбрали лучшую модель для них — потому что у нас есть удивительные люди как Sam, и мы — мы эксперты.

Yeah. We have both the traffic and the internal taste and experience to know what we think is best.

Да. У нас есть и трафик, и внутренний вкус, и опыт, чтобы знать, что мы считаем лучшим.

swyx: Yeah. I have this ongoing thesis of agent lab versus model lab. And to me, Cursor and other companies are examples of an agent lab that is building a new playbook that is different from a model lab, where it’s like very GPU heavy.

swyx: Да. У меня есть текущий тейк — agent lab vs model lab. И для меня Cursor и другие компании — это пример agent lab, который строит новый playbook, отличный от model lab, где это очень GPU-тяжёлое.

Cursor obviously has a research [00:38:00] team. And my thesis is like, every agent lab is going to have a router, because you’re going to be asked like, what’s what. I don’t keep up every day. I’m not a Sam, I don’t keep up every day. I’m using you, Sam, as the arbiter of taste. Put me on Cursor Auto.

У вас, очевидно, есть research-команда [00:38:00]. И моя теза: каждый agent lab будет иметь router, потому что тебя будут спрашивать — какой… я не успеваю каждый день. Я не Sam, я не успеваю каждый день. Если использовать тебя как arbiter вкуса — поставь меня на Cursor Auto.

Is it free? It’s not free.

Это бесплатно? Не бесплатно.

Jonas: Auto’s not free, but there’s different pricing tiers. Yeah.

Jonas: Auto не бесплатно, но есть разные тарифные планы. Да.

swyx: Put me on Cursor. You decide for me based on all the other people, you know better than me. And I think every agent lab should basically end up doing this, because that actually gives you extra power, because people stop caring or having loyalty to one lab.

swyx: Поставь меня на Cursor, решай за меня на основе всех других людей, которых ты знаешь лучше, чем меня. И я думаю, каждый agent lab должен по сути в итоге это делать, потому что это реально даёт тебе экстра-силу — люди перестают иметь лояльность к одному лабу.

Jonas: Yeah.

Jonas: Да.

Best Of N and Model Councils

Best-of-N и model councils

Jonas: Two other maybe interesting things that I don’t know how much they’re on your radar. One is the best-of-N thing we mentioned, where running different models head to head is actually quite interesting because

Jonas: Две другие, может быть, интересные штуки, не знаю, насколько они у тебя на радаре. Одна — best-of-N штука, которую мы упоминали — запуск разных моделей head-to-head на самом деле довольно интересен, потому что

swyx: which exists in cursor.

swyx: которая существует в Cursor.

Jonas: That exists in Cursor IDE and web. So the problem is where do you run them?

Jonas: Это существует и в Cursor IDE, и в Web. Так что проблема — где их запускать?

swyx: Okay.

swyx: Окей.

Jonas: And so I can share my screen if that’s interesting. Yeah, interesting.

Jonas: И я могу расшарить экран, если интересно. Да-интересно.

swyx: Yeah. Yeah. Obviously parallel agents, very popular.

swyx: Да. Да. Очевидно, parallel agents очень популярны.

Jonas: Yes, exactly. Parallel agents

Jonas: Да, именно. Parallel agents.

swyx: In your mind, are they the same thing? Best-of-N and parallel agents? I don’t want to [00:39:00] put words in your mouth.

swyx: в твоём ментальном восприятии — это одно и то же? Best-of-N и parallel agents? Не хочу за тебя слова вставлять [00:39:00].

Jonas: Best-of-N is a subset of parallel agents where they’re running on the same prompt.

Jonas: Best-of-N — это подмножество parallel agents, где они работают на одном промте.

That would be my answer. So this is what that looks like. And so here in this dropdown picker, I can just select multiple models.

Это был бы мой ответ. Вот как это выглядит. Здесь в этом дропдауне я могу просто выбрать несколько моделей.
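
Jonas's definition (best-of-N is parallel agents all given the same prompt) can be sketched as follows. This is an illustration under assumptions, not Cursor's implementation: `run_agent` is a hypothetical stand-in for launching a cloud agent, the model names are placeholders, and the `score` function is supplied by the caller because the transcript does not say how a winner is chosen.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List, Tuple


def run_agent(model: str, prompt: str) -> str:
    """Hypothetical stand-in for one agent run; a real run would return a diff."""
    return f"[{model}] diff for: {prompt}"


def best_of_n(
    models: List[str],
    prompt: str,
    score: Callable[[str], float],
) -> Tuple[str, Dict[str, str]]:
    """Run every model on the SAME prompt in parallel, then pick the top-scoring output."""
    with ThreadPoolExecutor(max_workers=len(models)) as pool:
        outputs = list(pool.map(lambda m: run_agent(m, prompt), models))
    candidates = dict(zip(models, outputs))
    winner = max(candidates, key=lambda m: score(candidates[m]))
    return winner, candidates


# Placeholder model names and a toy score (output length) purely for demonstration.
winner, candidates = best_of_n(["model-a", "model-bb"], "fix the login bug", score=len)
print(winner)
```

General parallel agents would differ only in that each worker receives its own prompt; here the single shared `prompt` argument is what makes it best-of-N.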

swyx: Yeah.

swyx: Да.

Jonas: And now if I do a prompt, I’m going to do something silly. I am running these five models.

Jonas: И теперь, если я делаю промт — я сделаю что-то глупое — я запускаю эти пять моделей.

swyx: Okay. This is this fake clone, of course. The 2.0 yeah.

swyx: Окей. Это фейк-клон, конечно. 2.0, да.

Jonas: Yes, exactly. But they’re running so the cursor 2.0, you can do desktop or cloud.

Jonas: Да, именно. Но они запускаются — так что Cursor 2.0, ты можешь делать desktop или cloud.

So this is cloud specifically, where the benefit over worktrees is that they have their own VMs and can run commands and won’t try to kill ports that the other one is running. Which are some of the pains. These are all

Это cloud конкретно, где преимущество над worktrees в том, что у них собственные VM, они могут запускать команды и не будут пытаться убить порты, на которых работает другой. Это некоторые из болей. Это все

swyx: called worktrees?

swyx: называются worktrees?

Jonas: No, these are all cloud agents with their own VMs.

Jonas: Нет, это всё облачные агенты со своими VM.

swyx: Okay. But

swyx: Окей. Но

Jonas: When you do it locally, sometimes people do worktrees, and that’s been the main way that people have set up parallel so far.

Jonas: когда делаешь это локально, иногда люди делают worktrees, и это был основной способ, которым люди до сих пор настраивали parallel.

I’ve gotta say.

Должен сказать.

swyx: That’s so confusing for folks.

swyx: Это так запутывает людей.

Jonas: Yeah.

Jonas: Да.

swyx: No one knows what worktrees are.

swyx: Никто не знает, что такое worktrees.

Jonas: Exactly. I think we’re phasing out worktrees.

Jonas: Именно. Я думаю, мы фазим worktrees.

swyx: Really.

swyx: Реально?

Jonas: Yeah.

Jonas: Да.

swyx: Okay.

swyx: Окей.

Samantha: But yeah. And one other thing I would say though on the multimodel choice, [00:40:00] so this is another experiment that we ran last year and decided not to ship at that time but may come back to, and there was an interesting learning that’s relevant for these different model providers. It was something that would run a bunch of best-of-Ns but then synthesize, and basically run like a synthesizer layer of models. And that was other agents that would take, like, an LLM judge, but one that was also agentic and could write code. So it wasn’t just picking but also taking the learnings from two models, or N models, that it was looking at and writing a new diff.

Samantha: Но да. И ещё одна вещь, которую я бы сказала по поводу мультимодельного выбора — [00:40:00] это ещё один эксперимент, который мы запустили в прошлом году и решили на тот момент не релизить, но можем вернуться. И там было интересное обучение, релевантное для разных провайдеров моделей. Это была штука, которая запускала кучу best-of-N, а потом синтезировала и по сути запускала слой моделей-синтезаторов. И это были другие агенты, которые брали LLM Judge, но этот тоже был агентным и мог писать код. Так что он не просто выбирал, но и брал выводы из двух или N моделей, на которые смотрел, и писал новый diff.

And what we found was that at the time at least, there were strengths to using models from different model providers as the base level of this process. Like basically you could get almost like a synergistic output that was better than having a very unified, like bottom model tier. So it was really interesting ’cause it’s like potentially, even in the future when you have like maybe one model ahead of the others for a little bit, there could be some benefit from having like multiple top-tier models involved in like a [00:41:00] model swarm or whatever agent swarm that you’re doing, since they each have strengths and weaknesses.

И что мы обнаружили: на тот момент по крайней мере были преимущества от использования моделей разных провайдеров в качестве базового уровня этого процесса. Типа, по сути, можно было получить почти синергетический выход, лучший, чем при унифицированном [00:41:00] нижнем тире моделей. Это было реально интересно — потому что даже в будущем, когда у тебя одна модель впереди других на какое-то время, может быть выгода от того, что несколько top-tier моделей вовлечены в model swarm или agent swarm — что у них у каждой свои сильные и слабые стороны.
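
A minimal sketch of the best-of-N plus synthesizer idea described above, offered as an editorial illustration and not as Cursor's implementation. The names `best_of_n`, `synthesize`, and the stub providers are hypothetical; the stubs stand in for real calls to models from different providers.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical type: a "model" is any function from a prompt to text (here, a diff).
Model = Callable[[str], str]


@dataclass
class Candidate:
    provider: str
    diff: str


def best_of_n(task: str, models: Dict[str, Model]) -> List[Candidate]:
    """Run the same task once per base model and keep every candidate."""
    return [Candidate(provider=name, diff=model(task)) for name, model in models.items()]


def synthesize(task: str, candidates: List[Candidate], synthesizer: Model) -> str:
    """Give all candidates to a synthesizer that writes a new diff instead of picking one."""
    sections = "\n\n".join(f"[{c.provider}]\n{c.diff}" for c in candidates)
    prompt = f"Task: {task}\n\nCandidate diffs:\n{sections}\n\nWrite one improved diff."
    return synthesizer(prompt)


# Stub models (assumption: no network calls; real providers would go here).
def provider_a(task: str) -> str:
    return "+ add input validation"


def provider_b(task: str) -> str:
    return "+ add retry on timeout"


def merge_stub(prompt: str) -> str:
    # Toy synthesizer: keeps every added line it finds across the candidates.
    return "\n".join(line for line in prompt.splitlines() if line.startswith("+ "))


task = "harden the upload handler"
candidates = best_of_n(task, {"provider_a": provider_a, "provider_b": provider_b})
final_diff = synthesize(task, candidates, merge_stub)
print(final_diff)
```

The point of the sketch is the shape: the base layer mixes providers, and the top layer produces new output from all candidates, which is different from a judge that only ranks them.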

Yeah.

Да.

Jonas: Andrej called this the council, right?

Jonas: Andrej называл это «совет», да?

Samantha: Yeah, exactly. We actually, oh, that’s another internal command we have that Ian wrote, /council. Oh, and they some, yeah.

Samantha: Да, именно. У нас есть другая внутренняя команда, которую Ian написал — /council. О, и они… да.

swyx: Yes. This idea is in various forms everywhere. And I think for me, like for me, the productization of it, you guys have done yeah, like this is very flexible, but.

swyx: Да. Эта идея в разных формах везде. И для меня — для меня продуктизация, которую вы сделали, да, очень гибкая, но.

If I were to add another Yeah, what your thing is on here it would be too much. I what, let’s say,

Если бы я добавил ещё одно — да, твоё, это было бы слишком много. Я имею в виду, скажем,

Samantha: Ideally it’s all, it’s something that the user can just choose and it all happens under the hood in a way where like you just get the benefit of that process at the end and better output basically, but don’t have to get too lost in the complexity of judging along the way.

Samantha: В идеале — это что-то, что пользователь может просто выбрать, и всё происходит под капотом так, что ты просто получаешь выгоду от этого процесса в конце и лучший вывод, по сути, не теряясь в сложности промежуточного судейства.

Jonas: Okay.

Jonas: Окей.

Subagents for Context

Subagents для контекста

Jonas: Another thing on the many agents, on different parallel agents, that’s interesting is an idea that’s been around for a while as well and that has started working recently: subagents. And so this is one other way to get agents with different prompts and different goals and different models, [00:42:00] different vintages, to work together.

Jonas: Ещё одна штука про многих агентов, про разных parallel-агентов, которая интересна — это идея, которая существует уже какое-то время и недавно начала работать — subagents. И это другой способ заставить агентов разных промтов, разных целей и разных моделей, [00:42:00] разных vintages работать вместе.

Collaborate and delegate.

Коллаборировать и делегировать.

swyx: Yeah. I’m very, like one of my, I’m always looking for “this is the year of the blah”, right? Yeah. I think one of the blahs is subagents. I think this is one, but I haven’t used them in Cursor. Are they fully formed, or how do I, honestly, I’d like an intro, because do I form them from new every time?

swyx: Да. Я очень — одна из моих, я всегда ищу — «это год того-то», верно? Да. Я думаю, одно из «того-то» — это subagents. Я думаю, это «то», но я не использовал их в Cursor. Они полностью сформированы или как? Честно говоря, нужен intro. Я каждый раз формирую их заново?

Do I have fixed subagents? How are they different from slash commands? There’s all these like really basic questions that no one stops to answer for people because everyone’s just like too busy launching. We have to

У меня фиксированные subagents? Чем они отличаются от slash-команд? Есть все эти реально базовые вопросы, на которые никто не останавливается ответить, потому что все слишком заняты лонч-ами. Нам нужно

Samantha: honestly, you could, you can see them in cursor now if you just say spin up like 50 subagents to, so cursor defines

Samantha: честно, ты можешь — можно увидеть их в Cursor сейчас, если просто сказать «подними 50 subagents, чтобы…» Cursor определяет

swyx: what Subagents.

swyx: что такое subagents.

Yeah.

Да.

Samantha: Yeah. So basically I think I shouldn’t speak for the whole subagents team. This is like a different team that’s been working on this, but our thesis or thing that we saw internally is that like they’re great for context management for kind of long running threads, or if you’re trying to just throw more compute at something.

Samantha: Да. По сути, я не должна говорить за всю subagents команду — это другая команда, которая над этим работала, но наш тезис или то, что мы видели внутри — они отличные для управления контекстом для долгих тредов или если ты просто пытаешься бросить больше compute на что-то.

We have strongly used almost like a generic task interface, where then the main agent can define [00:43:00] like what goes into the subagent. So if I say explore my code base, it might decide to spin up an explore subagent, or it might decide to spin up five explore subagents.

Мы сильно использовали почти что generic task interface, где главный агент может определить, [00:43:00] что входит в subagent. Так что если я говорю «исследуй мой кодбейс», он может решить поднять один explore-subagent или может решить поднять пять explore-subagents.

swyx: But I don’t get to set what those subagents are, right? It’s all defined by a model.

swyx: Но я не выбираю, какие это subagents, да? Всё определяется моделью.

Samantha: I think. I actually would have to refresh myself on the subagent interface.

Samantha: Я думаю, мне надо освежить в памяти subagent-интерфейс.

Jonas: There are some built-in ones, like the explore subagent is free, pre-built. But you can also instruct the model to use other subagents and then it will. And one other example of a built-in subagent is, I actually just kicked one off in Cursor and I can show you what that looks like.

Jonas: Есть встроенные — explore-subagent встроен бесплатно. Но ты также можешь инструктировать модель использовать другие subagents, и она это сделает. Ещё один пример встроенного subagent — я только что запустил один в Cursor, могу показать, как это выглядит.

swyx: Yes. Because I tried to do this in pure prompt space.

swyx: Да. Потому что я пытался это сделать в чистом prompt-пространстве.

Jonas: So this is the desktop app? Yeah. Yeah. And that’s

Jonas: Это desktop-приложение? Да. Да. И это

swyx: all you need to do, right? Yeah.

swyx: и всё, что нужно сделать, да? Да.

Jonas: That’s all you need to do. So I said use a subagent to explore and I think, yeah, so I can even click in and see what the subagent is working on here. It ran some find command and this is Composer under the hood.

Jonas: Это всё, что нужно сделать. Так что я сказал «используй subagent, чтобы исследовать», и я думаю — да, я даже могу кликнуть и посмотреть, над чем работает subagent. Он запустил какую-то find-команду, и это Composer под капотом.

Even though my main model is Opus, it does smart routing to take, like in this instance the explore task sort of requires reading a ton of things. And so a faster model is really useful to get an [00:44:00] answer quickly, but this is what subagents look like. And I think we wanted to do a lot more to expose hooks and ways for people to configure these.

Хотя моя основная модель — Opus, он делает smart-роутинг — в этом случае explore требует читать тонны вещей. И поэтому более быстрая модель реально полезна, чтобы получить [00:44:00] ответ быстро. Вот как выглядят subagents. И мы хотим много сделать, чтобы выставить хуки и способы для людей конфигурировать.

Another example of a Cursor sort of built-in subagent is the computer use subagent in the cloud agents, where we found that those trajectories can be long and involve a lot of images obviously, and execution of some testing verification task. We wanted to use models that are particularly good at that.

Другой пример Cursor-овского встроенного subagent — computer-use subagent в облачных агентах, где мы обнаружили, что эти траектории могут быть длинными и включать много изображений и выполнение какой-то тестирующей задачи верификации. Мы хотели использовать модели, особенно хорошие в этом.

So that’s one reason to use subagents. And then the other reason to use subagents is we want contexts to be summarized, reduced down at a subagent level. That’s a really neat boundary at which to compress that rollout and testing into a final message that agent writes that then gets passed into the parent rather than having to do some global compaction or something like that.

Это одна причина использовать subagents. И другая причина использовать subagents — мы хотим, чтобы контекст был саммаризован — сжат на уровне subagent. Это реально хорошая граница, на которой можно сжать этот rollout и тестирование в финальное сообщение, которое агент пишет, которое потом передаётся родителю — вместо того чтобы делать какой-то глобальный compaction или что-то такое.
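
A small sketch of the context boundary Jonas describes, under stated assumptions: the classes `Subagent` and `ParentAgent`, the model label, and the simulated steps are all hypothetical, and nothing here reflects Cursor's actual interface. It only shows that the subagent keeps its long transcript to itself and the parent's context grows by one final message.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Subagent:
    name: str
    model: str  # hypothetical label, e.g. a faster model for exploration
    transcript: List[str] = field(default_factory=list)

    def run(self, task: str, steps: List[str]) -> str:
        """Record every step locally, then return one short final message."""
        self.transcript.append(f"task: {task}")
        self.transcript.extend(steps)  # long rollout stays inside the subagent
        last = steps[-1] if steps else "no steps taken"
        return f"{self.name} finished '{task}' after {len(steps)} steps: {last}"


@dataclass
class ParentAgent:
    context: List[str] = field(default_factory=list)

    def delegate(self, sub: Subagent, task: str, steps: List[str]) -> str:
        """Only the subagent's final message crosses into the parent context."""
        summary = sub.run(task, steps)
        self.context.append(summary)
        return summary


parent = ParentAgent()
explorer = Subagent(name="explore", model="fast-model")
# Simulated tool outputs; a real subagent would produce these by acting.
steps = ["ran find over src/", "read 40 files", "auth logic lives in src/auth/session.py"]
parent.delegate(explorer, "locate the auth logic", steps)
print(len(explorer.transcript), len(parent.context))
```

Compressing at this boundary means the parent never has to run a global compaction over the subagent's rollout, because that rollout was never in its context.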

swyx: Awesome. Cool. While we’re in the subagents conversation, I can’t do a Cursor conversation and not talk about Wilson’s stuff. What is that? What is what? He built a browser. He built an OS. Yes. And he [00:45:00] experimented with a lot of different architectures and basically ended up reinventing the software engineer org chart.

swyx: Потрясающе. Круто. Раз уж мы в разговоре про subagents — не могу провести разговор про Cursor и не поговорить про штуки Wilson. Что это? Что такое? Он построил браузер. Он построил OS. Да. И [00:45:00] он экспериментировал с кучей разных архитектур и в итоге переизобрёл org-чарт software engineer.

This is all cool, but what’s your take? Are there any behind-the-scenes stories about that whole adventure?

Всё это круто, но какой твой тейк? Какие, есть ли закулисные истории про всё это приключение?

Samantha: Some of those experiments have found their way into a feature that’s available in cloud agents now, the long-running agent mode. Internally, we call it grind mode.

Samantha: Некоторые из этих экспериментов нашли путь в фичу, доступную в облачных агентах сейчас — long-running agent mode, внутри мы называем его grind mode.

And I think there’s like some hint of grind mode accessible in the picker today, ’cause you can choose “grind until done”. And so that was really the result of experiments that Wilson started in this vein, where I think the Ralph Wiggum loop was like floating around at the time, but it was something he also independently found and was experimenting with.

И я думаю, какой-то намёк на grind mode доступен в пикере сегодня. Потому что можно выбрать «grind until done». И это был результат экспериментов, которые Wilson начал в этом ключе — я думаю, Ralph Wiggum loop в то время как раз летал, но он также независимо нашёл и экспериментировал.

And that was what led to this product surface.

И это привело к этой продуктовой поверхности.

swyx: And it is just the simple idea of: have criteria for completion and do not stop until you complete,

swyx: И это просто простая идея — иметь критерии завершения и не останавливаться, пока не завершишь.

Samantha: there’s a bit more complexity as well in, in our implementation. Like there’s a specific, you have to start out by aligning and there’s like a planning stage where it will work with you and it will not get like start grind execution mode until it’s decided that the [00:46:00] plan is amenable to both of you.

Samantha: Есть чуть больше сложности в нашей имплементации. Есть специфическое — ты должен начать с согласования, и есть planning stage, где он будет работать с тобой и не начнёт grind execution mode, пока не решит, что план приемлем для вас обоих [00:46:00].

Basically,

По сути,

swyx: I refuse to work until you make me happy.

swyx: Я отказываюсь работать, пока ты меня не порадуешь.

Jonas: We found that it’s really important where people would give like very underspecified prompt and then expect it to come back with magic. And if it’s gonna go off and work for three minutes, that’s one thing. When it’s gonna go off and work for three days, probably should spend like a few hours upfront making sure that you have communicated what you actually want.

Jonas: Мы обнаружили, что это реально важно — люди давали очень недоспецифицированные промты и потом ожидали, что он вернётся с магией. И если он уйдёт работать на три минуты — это одно. Когда он уйдёт работать на три дня — наверное, стоит провести несколько часов вверху, убедившись, что ты сообщил, чего реально хочешь.
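
The "agree on a plan first, then grind until the completion criteria pass" idea can be sketched as a loop. This is an editorial illustration under assumptions: `grind_until_done`, the `plan_approved` flag, and the toy checks are hypothetical names, and the real grind mode is described above as more involved than this.

```python
from typing import Callable, List


def grind_until_done(
    plan_approved: bool,
    step: Callable[[int], None],
    checks: List[Callable[[], bool]],
    max_iterations: int = 100,
) -> int:
    """Refuse to start without an agreed plan, then iterate until every check passes."""
    if not plan_approved:
        raise RuntimeError("plan not agreed yet; refusing to start execution")
    for iteration in range(1, max_iterations + 1):
        step(iteration)  # one unit of agent work
        if all(check() for check in checks):  # completion criteria
            return iteration
    raise TimeoutError(f"criteria still failing after {max_iterations} iterations")


# Toy state: each step makes one more test pass; done when all three pass.
state = {"passing_tests": 0}


def do_work(iteration: int) -> None:
    state["passing_tests"] += 1


def all_tests_pass() -> bool:
    return state["passing_tests"] >= 3


iterations = grind_until_done(True, do_work, [all_tests_pass])
print(iterations)
```

The upfront gate matters more as runs get longer: a vague prompt costs little on a three-minute task and a lot on a multi-day one.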

swyx: Yeah. And just to like really drive home the point. We really mean three days with no, no

swyx: Да. И просто чтобы реально подчеркнуть — мы реально имеем в виду три дня, что — никакого

Jonas: human. Oh yeah. We’ve had three-day runs with no intervention whatsoever.

Jonas: человека. О да. У нас были трёхдневные запуски вообще без какого-либо вмешательства.

Samantha: I don’t know what the record is, but there’s been a long time with the grinds

Samantha: Не знаю, какой рекорд, но grind’ы шли долго

Jonas: and so the thing that is available in cursor. The long running agent is if you wanna think about it, very abstractly that is like one worker node.

Jonas: И штука, доступная в Cursor — long-running агент, если думать о нём очень абстрактно — это как одна worker node.

Whereas what built the browser is a society of workers and planners and different agents collaborating. Because we started building the browser with one worker node at the time, that was just the agent. And it became one worker node when we realized that the throughput of the system was not where it needed to be [00:47:00] to get something as large of a scale as the browser done.

Тогда как то, что построило браузер — это общество воркеров и плэннеров и разных коллаборирующих агентов. Потому что мы начали строить браузер с одной worker node на тот момент — это был просто агент. И это стало одной worker node, когда мы поняли, что throughput системы — не там, где должен быть [00:47:00], чтобы сделать что-то такого масштаба, как браузер.

swyx: Yeah.

swyx: Да.

Jonas: And so this has also become a really big mental model for us with cloud agents: there’s the classic engineering latency-throughput trade-off. And so, you know, the code is water flowing through a pipe. We think that over the coming months, the big unlock is not going to be one person with a model getting more done, like the water flowing faster; we’ll be making the pipe much wider and so parallelizing more. Whether that’s swarms of agents or parallel agents, both of those are things that contribute to getting

Jonas: И это тоже стало реально большой ментальной моделью для нас с cloud, cloud-агентами — есть классические инженерные latency-throughput trade-offs. Код — это вода, текущая через трубу. Мы считаем, что в ближайшие месяцы большой прорыв будет не в том, что один человек с моделью успевает больше — то есть вода течёт быстрее. Мы будем делать трубу гораздо шире и так распараллеливать больше — будь то рои агентов или parallel-агенты — и то и другое способствует тому, чтобы…

Much more done in the same amount of time, but any one of those tasks doesn’t necessarily need to get done that quickly. And throughput is this really big thing where if you see the system of a hundred concurrent agents outputting thousands of tokens a second, you can’t go back like that.

успевать гораздо больше за то же время. Но любая конкретная из этих задач не обязательно должна быть выполнена так быстро. И throughput — это реально большая штука, где если ты видишь систему из ста concurrent-агентов, выдающих тысячи токенов в секунду — ты не можешь вернуться.

Just you see a glimpse of the future where obviously there are many caveats. Like no one is using this browser IRL. There’s like a bunch of things not quite right yet, but we are going to get to systems that produce real production [00:48:00] code at this scale much sooner than people think. And it forces you to think what even happens to production systems. Like we’ve broken our GitHub Actions recently because we have so many agents like producing and pushing code that like CI/CD is just overloaded. ’Cause suddenly it’s like effectively we grew, Cursor’s growing very quickly anyway, but you grow headcount 10x when people run 10x as many agents.

Просто… ты видишь проблеск будущего, где, очевидно, много caveats. Никто не использует этот браузер IRL. Есть куча вещей, не совсем правильных пока, но мы дойдём до систем, которые производят реальный production-код [00:48:00] на этом масштабе гораздо скорее, чем люди думают. И это заставляет тебя думать — что вообще случится с production-системами. У нас недавно сломались GitHub Actions, потому что у нас столько агентов производят и пушат код, что CI/CD просто перегружен. Потому что внезапно — мы по сути выросли. Cursor растёт очень быстро в любом случае, но ты растишь headcount в 10 раз, когда люди запускают в 10 раз больше агентов.

And so a lot of these systems, exactly, a lot of these systems will need to adapt.

И многие из этих систем — именно, многие из этих систем должны будут адаптироваться.
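
The pipe analogy can be made concrete with a little arithmetic. The numbers below are illustrative assumptions chosen to match "a hundred concurrent agents outputting thousands of tokens a second"; they are not measurements, and the model assumes the work splits perfectly across agents, which real tasks rarely do.

```python
def aggregate_rate(agents: int, tokens_per_second_each: float) -> float:
    """Total system throughput: the width of the pipe."""
    return agents * tokens_per_second_each


def wall_clock_seconds(total_tokens: int, agents: int, tokens_per_second_each: float) -> float:
    """Time to finish a job, assuming it parallelizes perfectly (a strong assumption)."""
    return total_tokens / aggregate_rate(agents, tokens_per_second_each)


JOB_TOKENS = 1_080_000  # illustrative job size

# One fast agent: lower latency per task, narrow pipe.
single_seconds = wall_clock_seconds(JOB_TOKENS, agents=1, tokens_per_second_each=60)

# A hundred agents, each half as fast: slower per task, much wider pipe.
swarm_rate = aggregate_rate(100, 30)
swarm_seconds = wall_clock_seconds(JOB_TOKENS, agents=100, tokens_per_second_each=30)

print(single_seconds / 3600, "hours vs", swarm_seconds / 60, "minutes")
```

Under these assumptions each swarm agent is slower than the single agent, yet the whole job finishes fifty times sooner, which is the throughput-over-latency point made above.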

swyx: It also reminds me, we, we all, the three of us live in the app layer, but if you talk to the researchers who are doing RL infrastructure, it’s the same thing. It’s like all these parallel rollouts and scheduling them and making sure as much throughput as possible goes through them.

swyx: Это также напоминает — мы все трое живём в app-слое, но если поговорить с исследователями, делающими RL-инфру — это то же самое. Это все эти parallel rollouts и их шедулинг, и чтобы как можно больше throughput шло через них.

Yeah, it’s the same thing.

Да, то же самое.

Jonas: We were talking briefly before we started recording. You were mentioning memory chips and some of the shortages there. The other thing that I think is just like hard to wrap your head around is the scale of the system that was building the browser, the concurrency there.

Jonas: Мы говорили вкратце перед записью. Ты упоминал чипы памяти и некоторый shortage там. Другая штука, которую просто сложно охватить умом — масштаб системы, которая строила браузер, concurrency там.

If Sam and I both have a system like that running for us, [00:49:00] shipping our software, the amount of inference that we’re going to need per developer is just really mind-boggling. And that makes, sometimes when I think about that, I think that even with the most optimistic projections for what we’re going to need in terms of buildout, we are underestimating the extent to which these swarm systems can like churn at scale to produce code that is valuable to the economy.

Если у Sam и меня обоих будет такая система, [00:49:00] выкатывающая наш софт. Количество inference, которое нам понадобится на разработчика — просто умопомрачительно. И это делает — иногда, когда я об этом думаю, я думаю, что даже при самых оптимистичных прогнозах того, что нам понадобится по buildout, мы недооцениваем, до какой степени эти swarm-системы могут churn-ить на масштабе и производить код, ценный для экономики.

And,

И,

swyx: yeah, you can cut this if it’s sensitive, but I was just, do you have estimates of how much your token consumption is?

swyx: да, можешь вырезать, если это чувствительно, но я просто — есть оценки того, сколько у вас токеновое потребление?

Jonas: Like per developer?

Jonas: Типа на разработчика?

swyx: Yeah. Or yourself. I don’t need like company average. I’m just curious.

swyx: Да. Или у тебя самого. Не нужно среднее по компании. Просто любопытно.

Samantha: I feel like I, for a while I wasn’t an admin on the usage dashboard, so I like wasn’t able to actually see, but it was a,

Samantha: Я какое-то время не была админом в дашборде usage, так что не могла видеть, но это был…

swyx: mine has gone up.

swyx: мой вырос.

Samantha: Oh yeah.

Samantha: О да.

swyx: But I think

swyx: Но я думаю

Samantha: it’s in terms of how much work I’m doing, it’s more like I have no worries about developers losing their jobs, at least in the near term. ’Cause I feel like that’s a more broad discussion.

Samantha: в плане того, сколько работы я делаю — скорее так: у меня нет беспокойств про разработчиков, теряющих работу, по крайней мере в ближайшей перспективе. Потому что это более широкое обсуждение.

swyx: Yeah. Yeah. You went there. I didn’t go, I wasn’t going there. I was just like how much more are you using?

swyx: Да. Да. Ты пошла туда. Я не шёл, я туда не шёл. Я просто — сколько больше ты используешь?

Samantha: There’s so much stuff to be built. And so I feel like I’m basically just [00:50:00] trying to constantly I have more ambitions than I did before. Yes. Personally. Yes. So can’t speak to the broader thing. But for me it’s like I’m busier than ever before.

Samantha: Так много всего надо построить. И поэтому я по сути просто [00:50:00] постоянно — у меня больше амбиций, чем было раньше. Да. Лично. Да. Не могу говорить за всё. Но для меня это: я занята как никогда раньше.

I’m using more tokens and I am also doing more things.

Я использую больше токенов и я также делаю больше вещей.

Jonas: Yeah. Yeah. I don’t have the stats for myself, but I think broadly a thing that we’ve seen, that we expect to continue, is Jevons paradox. Where

Jonas: Да. Да. У меня нет статистики на себя, но я думаю, в широком смысле — то, что мы видели, и что, как мы ожидаем, продолжится — это парадокс Джевонса.

swyx: you can’t do it in our podcast without saying

swyx: Без этого нельзя в нашем подкасте

Jonas: it. Exactly. We’ve done it. Now we can wrap. We’ve done, we said the words.

Jonas: именно. Мы это сделали. Теперь можем заканчивать. Мы сказали слова.

Phase one, tab autocomplete, people paid like 20 bucks a month. And that was great. Phase two, where you were iterating with these local models, today people pay like hundreds of dollars a month. I think as we think about these highly parallel kind of agents running off for a long time in their own VM system, we are already at that point where people will be spending thousands of dollars a month per human, and I think potentially tens of thousands and beyond, where it’s not like we are greedy for like capturing more money, but what happens is just individuals get that much more leverage.

Фаза один, tab autocomplete: люди платили по 20 баксов в месяц. И это было отлично. Фаза два, где ты итерируешь с этими локальными моделями: сегодня люди платят сотни долларов в месяц. Я думаю, когда мы думаем об этих highly parallel agents, работающих долго в своих собственных VM-системах, мы уже в точке, где люди будут тратить тысячи долларов в месяц на человека, и я думаю, потенциально десятки тысяч и больше. Это не то что мы жадные до больших денег, просто индивидуумы получают настолько больше leverage.

And if one person can do as much as 10 people, yeah. That tool that allows ‘em to do that is going to be tremendously valuable [00:51:00] and worth investing in and taking the best thing that exists.

И если один человек может делать столько, сколько 10 — да, этот инструмент, который позволяет это, будет невероятно ценным [00:51:00] и стоит инвестиций — и того, чтобы брать лучшее существующее.

swyx: One more question on just the cursor in general and then open-ended for you guys to plug whatever you wanna put.

swyx: Ещё один вопрос про Cursor в целом, и потом open-ended для вас, чтобы вы плагнули, что хотите.

How is Cursor hiring these days?

Как Cursor нанимает в эти дни?

Samantha: What do you mean by how?

Samantha: Что значит «как»?

swyx: So obviously LeetCode is dead. Oh,

swyx: Очевидно, LeetCode мёртв.

Samantha: okay.

Samantha: О, окей.

swyx: Everyone says work trial. Different people have different levels of adoption of agents. Some people can really adopt can be much more productive. But other people, you just need to give them a little bit of time.

swyx: Все говорят «work trial». Разные люди имеют разный уровень адопции агентов. Некоторые люди могут реально адоптировать — могут быть гораздо продуктивнее. Но другим людям нужно немного времени.

And sometimes they’ve never lived in a token rich place like cursor.

И иногда они никогда не жили в token-rich месте, как Cursor.

And once you live in a token-rich place, you just work differently. But you need to have done that. And a lot of people, anyway, it was just open-ended. Like how has agentic engineering, agentic coding changed your opinions on hiring?

И как только живёшь в token-rich месте — ты просто работаешь по-другому. Но нужно это сделать. И много людей — в любом случае, это open-ended. Как агентная разработка, agentic coding изменили твои мнения по найму?

Is there any like broad like insights? Yeah.

Есть какие-то широкие инсайты? Да.

Jonas: Basically I’m asking this for other people, right? Yeah, totally. Totally. To hear Sam’s opinion, we haven’t talked about this the two of us. I think that we don’t see necessarily being great at the latest thing with AI coding as a prerequisite.

Jonas: По сути, я спрашиваю это для других людей, верно? Да, абсолютно. Абсолютно. Чтобы услышать мнение Sam — мы вдвоём об этом не говорили. Я думаю, мы не видим обязательно крутость в последней штуке с AI-coding как prerequisite.

I do think that’s a sign that people are keeping up and [00:52:00] curious and willing to upskill themselves in what’s happening, because, as we were talking about, in the last three months the game has completely changed. It’s like what I do all day is very different.

Я думаю, это знак, что люди следят и [00:52:00] любопытны и готовы прокачивать свои навыки в происходящем, потому что, как мы говорили, за последние три месяца игра полностью изменилась. То, что я делаю весь день, совсем другое.

swyx: Like it’s my job and I can’t,

swyx: Это моя работа, и я не могу,

Jonas: Yeah, totally.

Jonas: Да, абсолютно.

I do think that still, as Sam was saying, the fundamentals remain important in the current age, and being able to go and double-click down. And models today do still have weaknesses where if you let them run for too long without cleaning up and refactoring, the code will get sloppy and there’ll be bad abstractions.

Я думаю, до сих пор, как говорила Sam, фундаменты остаются важны в текущей эре, и способность пойти и кликнуть в детали. Модели сегодня всё ещё имеют слабости — если позволять им работать слишком долго без уборки и рефакторинга, код станет неряшливым, будут плохие абстракции.

And so you still do need humans that like have built systems before, know good patterns when they see them, and know where to steer things.

И поэтому всё ещё нужны люди, которые строили системы раньше, знают хорошие паттерны, когда их видят, и знают, куда направлять.

Samantha: I would agree with that. I would say again, Cursor also operates very quickly and leveraging agentic engineering is probably one reason why that’s possible in this current moment.

Samantha: Согласна с этим. Я бы сказала — снова, Cursor оперирует тоже очень быстро, и леверидж agentic engineering — наверное, одна из причин, почему это возможно в текущий момент.

I think in the past it was just like people coding quickly and now there’s like people who use agents to move faster as well. So it’s part of our process will always look for we’ll select for kind of that ability to make good decisions quickly and move well in this environment.

Я думаю, в прошлом это были просто люди, кодирующие быстро, а теперь есть люди, которые используют агентов, чтобы двигаться быстрее. Так что часть нашего процесса всегда будет искать — мы будем селектить за способность принимать хорошие решения быстро и хорошо двигаться в этой среде.

And so I think being able to [00:53:00] figure out how to use agents to help you do that is an important part of it too.

И я думаю, что способность [00:53:00] выяснить, как использовать агентов, чтобы помочь тебе это делать — важная часть.

swyx: Yeah. Okay. The fork in the road: either predictions for the end of the year, if you have any, or plugs.

swyx: Да. Окей. Развилка — либо предсказания на конец года, если есть, либо плаги.

Jonas: Predictions are not going to go well.

Jonas: Предсказания не будут хорошими.

Samantha: I know it’s hard.

Samantha: Знаю, это тяжело.

swyx: They’re so hard. Get it wrong.

swyx: Они так сложны. Ошибёшься.

It’s okay. Just, yeah.

Окей. Просто, да.

Jonas: One other plug that may be interesting that I feel like we touched on but haven’t talked a ton about is a thing that the kind of these new interfaces and this parallelism enables is the ability to hop back and forth between threads really quickly. And so a thing that we have,

Jonas: Один другой плаг, который может быть интересен — мы его коснулись, но не говорили о нём много — это что эти новые интерфейсы и этот параллелизм позволяет — способность прыгать между тредами очень быстро. И штука, которая у нас есть,

swyx: you wanna show something or,

swyx: хочешь показать что-то или…

Jonas: yeah, I can show something.

Jonas: да, могу показать.

A thing that we have felt with local agents is this pain around context switching. And you have one agent that went off and did some work and another agent that did something else. And so here by having, I just have three tabs open, let’s say, but I can very quickly hop in here.

Штука, которую мы почувствовали с локальными агентами — это боль вокруг переключения контекста. У тебя один агент, который ушёл и сделал какую-то работу, и другой агент, который сделал что-то ещё. И здесь, имея — у меня просто три таба открыты, скажем, — я могу очень быстро запрыгнуть сюда.

This is an example I showed earlier, but the actual workflow here I think is really different in a way that may not be obvious, where, I start the morning, I kick off 10 agents or something, the first one of them [00:54:00] finishes, come in, watch the video either as close. And so I might send a follow up.

Это пример, который я показывал ранее, но реальный воркфлоу, я думаю, реально другой — может быть, не очевидно. Я начинаю утро, запускаю 10 агентов или около того, первый из них [00:54:00] заканчивает — захожу, смотрю видео, либо как закрываю. И могу отправить follow-up.

I might say, Hey, make it red, or I might hop into the desktop and try it out. And within, 90, 120 seconds, I’ve kicked this one back off. And either started the merge process like CI is running now and I’ll come back to it later or it’s off with some additional follow up information. And then I can hop into the next one.

Могу сказать «сделай красным», или могу запрыгнуть в desktop и попробовать. И за 90, 120 секунд я перезапустил этого, либо начал процесс мерджа — CI сейчас крутится, и я вернусь позже — либо он уехал с какой-то дополнительной follow-up информацией. И тогда я могу запрыгнуть в следующий.

And then the next one I hop in and I’m like, okay, this looks interesting. Actually try it out for real in the app. I want to see it in action, not just in the gallery. So I can kick that off and the agent will go and work on that because maybe I wanted to try it out, like what the button looks like in the actual thing.

И в следующий запрыгиваю и говорю: окей, это выглядит интересно. Давай реально попробую в приложении. Хочу увидеть в действии, не только в галерее. И я могу это запустить, и агент пойдёт работать над этим, потому что я, может быть, хотел попробовать — как кнопка выглядит в реальной штуке.

And then here I might hop in as well and, check the video here or do something. And so you’re really parallelizing much more and follow up here, check in there. It’s much more this higher level of abstraction and having the different desktops where you can hop back and forth and you’re [00:55:00] not like, oh, I checked out this branch.

И сюда я могу запрыгнуть и проверить видео тут или сделать что-то. И ты реально гораздо больше распараллеливаешь и фоллоу-апишь, проверяешь — это гораздо больше высокий уровень абстракции, имея разные десктопы, где можно прыгать туда-сюда, и ты не [00:55:00] «о, я checkout’нул эту ветку, о, где же был тот worktree снова?». Да. Это реально решает за то, с чем мы сами в Cursor и в этих локальных агентах боролись: «где же был тот diff снова?» — он потерян в каком-то worktree, никогда не найду. О, моя локалка ребилдится. О, просто сделай ещё один.

Oh, where was that worktree again? Yeah. It’s really solving for what we ourselves have struggled with at Cursor with these local agents, which is, where was that diff again? It’s lost in some worktree. Never gonna find it. Oh, my local thing is rebuilding. Oh, just make another one.

Вот к чему приходишь. И ты потом ждёшь ещё пять минут, пока запустится. Это реально новый способ параллелизации, который мы нашли честно очень весёлым. Да. Где ты просто прыгаешь и инжектируешь вкус, и говоришь: «это не совсем то».

That’s what you end up with, and then you wait five more minutes for it to run. And so this is really a new way of parallelizing that we found to be really fun, honestly. Yeah. Where you’re just hopping in and injecting taste, and you’re like, that doesn’t quite feel right.

О, на самом деле это не совсем правильно сархитектурено — ты просто фокусируешься на этих интересных вопросах вкуса.

Oh, actually this is not architected quite right. You’re just focusing on those interesting taste questions.

Samantha: Для меня cloud-экосистема также позволила сделать это — добавлять продуктивность к моему мёртвому времени, типа коммьюту или ночью или такому. Тот факт, что мне не нужно оставлять компьютер открытым,

Samantha: For me, the cloud ecosystem also enabled this to be something that adds productivity to my dead time, like commuting or overnight or something like that.

swyx: у Cursor нет — есть ли мобильное приложение Cursor?

The fact that I don’t have to leave my computer open,

Samantha: Если есть, я не уверена. Это текущая штука — мы — я использую на своём телефоне всё время, просто через веб. Так что довольно хороший опыт там для чек-ина [00:56:00]. Да. И анлок. Я думаю, да, можно смотреть видео и прочее в web-app, что потрясающе.

swyx: There’s no Cursor... is there a Cursor mobile app?

Да.

Samantha: If there is, I’m not sure. The current thing is, I use it on my phone all the time, just on the web. So pretty good experience there for checking [00:56:00] in. Yeah. And unlocking. I think, yeah. You can see the videos and stuff in the web app, which is awesome.

Jonas: Да.

Yeah.

swyx: Я думаю, это та, где ADD унаследует землю, если у тебя attention span готов, но ты всё ещё можешь управлять — на самом деле это хорошо для тебя. Да. Но также я думаю — здесь coding-инструменты начинают сталкиваться с productivity-инструментами, где linear, kanban-доски — потому что то, что у тебя есть — это круто, но ты знаешь что, тебе на самом деле нужна kanban-доска. Которую люди vibe’ят — vibe kanban существует, open-source. Уверен, вы об этом говорили, но они начнут сталкиваться, потому что на самом деле код больше не важен.

Jonas: Yeah.

Это процесс взаимодействия и чек-ина человека. И вижу, типа, как заставить World of Warcraft sound package работать или что угодно. Типа «работа сделана» или, не знаю — интересная штука будущей продуктивности.

swyx: I think this is the one where the ADD will inherit the earth: if your attention span is cooked but you can still manage, this is actually good for you. Yeah. But also I think this is where the coding tools start coming into conflict with the productivity tools, like Linear, the kanban boards, because what you have there is cool, but you know what, you actually need a kanban board. Which people have vibe-coded: Vibe Kanban is out there, open source. I’m sure you guys have talked about it, but they’ll start to conflict because actually the code doesn’t matter anymore.

Samantha: Да.

It’s the process of the human interacting and checking in. And seeing, like, getting the World of Warcraft sound package to go “work” or whatever. It’s like “job done” or, I don’t know. It’s an interesting future productivity thing.

swyx: Я также думаю, что ещё одна большая тема — типа в прошлом году называлась «год coding-агентов».

Samantha: Yeah.

В этом году другая — coding-агенты переливаются в реальный мир, в cloud cowork и всё другое. Да. Уверен, Cursor будет фокусироваться на софте, но скажем — open Claude крайне [00:57:00] mind-expanding в плане «я не знал, что такое может случиться».

swyx: I also think there’s another big theme. Last year was called the year of coding agents.

Jonas: Да.

This year, coding agents spill over to the real world, into Claude Cowork and all the other stuff. Yeah. I’m sure Cursor’s gonna focus on software, but let’s call it like OpenClaw is extremely [00:57:00] mind expanding in terms of, I did not know that could happen.

swyx: И это всё основано на coding-агенте — полностью.

Jonas: Yeah.

Jonas: И я думаю, одна из вещей, которая интересна — говоря с друзьями и семьёй, которые не в software-мире — я ускоряю предсказания. Я думаю, что мы начнём видеть, как другие индустрии проходят через то, через что начала проходить software-разработка.

swyx: And it’s all based on a coding agent, totally.

Я думаю, в силу того, насколько хороши модели в написании софта, и насколько early adopter люди, строящие новую технологию, и пробующие её и применяющие к себе — определённые виды сдвигов случатся и с другими индустриями. И много чему есть научиться из того, как это шло и продолжает идти в софте.

Jonas: And I think one of the things that’s interesting, talking to friends and family who are not in the software world, is speeding up predictions. I do think that we are going to start seeing other industries go through what software development has started going through.

В плане — все эти интересные вопросы — до какой точки люди получают больше leverage, когда роль становится гораздо более generalist? Все эти вопросы, по которым мы видели какие-то данные, но увидим гораздо больше в ближайшие месяцы. Это случится везде.

I think by virtue of how good models are at writing software, and how early-adopter the people building the new technology are in trying it out and applying it to themselves, certain kinds of shifts will happen in other industries too. And there’s a lot to be learned from how that’s gone down and is continuing to go down in software.

swyx: Sam, прощальные мысли? Какие-нибудь плаги от себя?

In terms of all the interesting questions: to what point do people get more leverage, when do you start changing the role to become much more generalist? All of these are questions that we’ve seen some data on, but we’ll see a lot more in the coming months. That will happen everywhere.

Samantha: Не особо. [00:58:00] Всё хорошо. Я чувствую, что мы покрыли так много хорошего материала. Покрыли. Покрыли много. Придумать предсказание. Я просто думаю, агенты будут продолжать становиться лучше. Я буду делать меньше ручного кодинга — наверное, ноль строк кода, написанных за весь декабрь этого года мной лично.

swyx: Sam, parting thoughts? Any plugs of your own?

100% агенты, как личное предсказание, но

Samantha: Not really. [00:58:00] It’s fine. I feel we covered so much good stuff. We covered it. We covered a lot. Coming up with a prediction: I just think agents are gonna keep getting better. I’m gonna stop doing as much manual coding, probably zero lines of code written in the whole month of December this year by myself.

swyx: о, ты не на нуле сегодня.

A hundred percent agents as a personal prediction, but

В каких случаях?

swyx: Oh, you’re not at zero today?

Samantha: Я думаю, честно, это 1%, если я просто разозлюсь, и я такая: «я не хочу идти и говорить агенту менять эту одну штуку». Но

In what cases?

Jonas: промтинг иногда — я чувствую, что работа над промтами иногда.

Samantha: I think honestly, it’s 1%, if I just get frustrated and I’m like, I don’t wanna go tell an agent to change this one thing. But

Да. Я всё ещё иду и редактирую руками, потому что это так — bare intent transfer, что говорить агенту, чего я хочу — это как писать эссе, где я не использую агентов для писания эссе пока — потому что процесс писания — это размышление.

Jonas: Prompting, sometimes. I feel like working on prompts, sometimes.

Samantha: Я всё ещё не выношу AI-generated writing. Так что да, я тоже не могу заставить агента писать промты.

Yeah. I still go in and manually edit, because it’s such bare intent transfer that telling the agent what I want is like writing an essay, and I don’t use agents to write essays yet, because the process of writing it is the thinking.

swyx: Так никакого DSPy, никакого GEPA, ничего такого здесь.

Samantha: I still can’t stand AI-generated writing. So yeah, I also can’t have the agent write prompts.

Jonas: У нас есть какой-то внутренний tooling вокруг штук prompt-оптимизации, но есть значительная часть — какие концепты мне нужно сообщить агенту или модели.

swyx: So no DSPy, no GEPA, nothing like that here.

swyx: Я также заметил ещё одну штуку, которую я тоже [00:59:00] ищу — voice.

Jonas: We have some internal tooling around some of the prompt optimization things, but there’s a fair amount of just what concepts do I need to communicate to the agent or the model.

Я заметил, что ты не использовала голос для кодинга — даже OpenAI. Когда мы делаем подкасты с ними, они не используют голос. Да. И я такой — в какой-то момент это становится хорошим. Можно перестать печатать.

swyx: I also noticed another thing I’m also [00:59:00] looking for is voice.

Samantha: У нас есть люди, которые это очень любят внутри, и я думаю, мы будем экспериментировать в этом пространстве тоже.

I noticed that you didn’t use your voice to code. Even OpenAI: when we do podcasts with them, they don’t use their voice. Yeah. And I’m like, at some point this gets good. You can stop typing.

Jonas: Ты используешь голос, swyx?

Samantha: We have some people who like that a lot internally, and I think we’ll be experimenting in that space too, for sure.

swyx: Не много. Иногда — это привязано к моему caps lock, и могу нажать.

Jonas: Do you use voice, swyx?

Jonas: И когда используешь — хочешь, чтобы оно отвечало, или просто хочешь

swyx: Not a lot. Sometimes. It’s bound to my Caps Lock, so I can press it. I just,

swyx: Да,

Jonas: and when you use it, do you want it to talk back or you just want

Jonas: просто дамп. Да. Да.

swyx: Yeah,

swyx: Но brain dump хорош. Да. Потому что можно перебивать себя, уходить в тангенс, что угодно.

Jonas: just dump in. Yeah. Yeah.

Он просто всё захватывает. Да. И запихнуть в LLM — всё нормально.

swyx: But like the brain dump is good. Yeah. Because you can interrupt yourself. You can go on a tangent, whatever.

Jonas: Да. Способ, которым мы это делали с Autotab — люди записывали полные screen recordings с аудио, чтобы научить модель, как делать задачу. И одна из забавных вещей, которую мы выучили — люди использовали свой Siri-голос, где начинали говорить короткими, отрывистыми фразами и очень чётко артикулировать — потому что они были привычны: они последний раз использовали AI два года назад, где нужно было

It just captures everything. Yeah. And plop it into an LLM, it’s fine.

swyx: Apple повредил целому поколению ожидания людей.

Jonas: Yeah. The way that we did this with Autotab was people would record full screen recordings with audio to teach the model how to do a task. And one of the funny things that we learned was people would use their Siri voice, where they would start talking in short, stilted sentences and enunciate really clearly, because they were used to that: they last used AI two years ago, where you had to

Jonas: Именно. И мы должны были сказать: «нет, ты очень нативный, [01:00:00] так что говори вот так, но просто дампи всё. Можешь повторяться, противоречить себе. Модели достаточно умны, чтобы разобраться».

swyx: Apple has damaged an entire generation of people’s expectations.

swyx: Но всё ещё очень плохо. Так что voice coding всегда считалось — я считал — самой сложной частью, потому что приходится говорить технические штуки, где, типа, написание имеет значение, capitalization имеет значение, и это всё не в голосе.

Jonas: Exactly. And we had to be like, no, you’re very native, so [01:00:00] you do this, but just dump everything in. You can repeat yourself. You can contradict yourself. The models are smart enough to figure it out,

Так что увидим. До сих пор это было больше emotional companionship — такое — но в какой-то момент дойдёт до voice coding.

swyx: but it’s still very bad. So voice coding was always, I considered, the hardest part, because you have to say technical things where spelling matters, capitalization matters, and it’s all not in voice.

Jonas: Да. У меня есть предсказание для тебя. Я предсказываю, что к концу года объём — я думаю, это займёт дольше, чем люди думают, и дольше, чем мы думаем, чтобы cloud-агенты, работающие в своих коробках, превзошли локальных агентов.

So we’ll see. So far it’s been more sort of emotional companionship, that kind of stuff, but at some point it’s gonna hit voice coding.

Но я думаю, что этот crossover случится до конца года, и, наверное, к концу года агенты, работающие в облаке, будут более чем 2x объёма локальных агентов.

Jonas: Yeah. I have a prediction for you. On volume: I think it will take longer than people think, and longer than we think, for cloud agents working in their own boxes to surpass local agents.

swyx: Окей. Ты мне оставляешь открытие. Что не очень хорошо сегодня?

But I think that crossover will happen before the end of the year, and probably by the end of the year, agents running in the cloud will be a multiple, like more than 2x, of the volume of local agents.

Jonas: Да, есть куча сложных вещей. Одна из них — просто сделать эти [01:01:00] sandboxes реально хорошими, и вещь, которая была частью этого запуска, на которой мы потратили непомерное количество времени — это cursor.com/onboard, где ты выбираешь репо, добавляешь секреты, даёшь ему доступ к вещам, и агент сам уходит и устанавливает вещи.

swyx: Okay. You’re leaving me an opening. What’s not good today?

swyx: Да, я думаю — всё целиком — это была моя любимая.

Jonas: Yeah, there’s a bunch of hard things. So one of them is just getting those [01:01:00] sandboxes to be really good, and a thing that was part of this launch that we spent an inordinate amount of time on is cursor.com/onboard, where you pick a repo, add secrets, give it access to things, and the agent just goes off and installs things.

Jonas: Да, мы много работали над этим. Sam и я в особенности провели много поздних ночей, делая это хорошо, но там ещё много чего нужно сделать. Сетап одна, две вещи. Может, он слишком медленный. Он слишком медленный — работаем. Сетап — не унитарная вещь, где всё либо настроено, либо нет, верно?

swyx: Yes, I think the whole thing. That was my favorite.

Вещи будут ломаться со временем. У тебя новые зависимости, тебе нужен доступ к новым системам — меняешь, где живёт БД. Это одна часть. И другая часть — иметь этих агентов, работающих в облаке, более автономными. Мы реально начали видеть отсутствие памяти.

Jonas: Yeah, we worked a lot on that. Sam and I in particular spent a lot of late nights making that good, but there’s still a lot to do there, right? On setup, one or two things. Maybe it’s too slow. It is too slow; we’re working on it. And setup is not a unitary thing where everything is set up or not, right?

А Sam, как кто-то, кто много об этом думал — как только модель начинает оперировать кодбейзом, есть больше особенностей, чем просто read file tool. Ей нужно знать, как запустить бэкенд, как проверить статус [01:02:00] бэкенда?

Like things will break over time. You have new dependencies, you need access to new systems, like you change where your database lives. So that’s one part of it. And then the other part of it is, having these agents run in the cloud and be more autonomous. We’ve really started to see the lack of memory.

Это очень специфично для твоего кодбейса. И даже если она отлично справляется с npm run watch или какими дефолтными вещами — всегда есть quirks. У всех есть quirks. И сделать модель хорошей в этих вещах потребует больше работы. Мы над этим работаем. Но мы думаем, это будет одним из больших анлоков — иметь их онбордженными не только в плане окружения, но и в плане понимания design trade-offs, как работает кодбейс, как быть хорошим разработчиком в любом одном кодбейсе.

And Sam is someone who’s thought a lot about this. Once you start getting the model operating the code base, there are more particularities; it’s not just a read file tool. It needs to know: how do I start up the backend, how do I check the status [01:02:00] of the backend?

swyx: Это куча Cursor rules. Будет что-то другое. Будет ли это файл? Просто назовём либо markdown файлом другого имени, и

That’s very particular to your code base. And even if it’s great at npm run watch or whatever the default things are, there are always quirks. Everyone has quirks. And getting the model good at those things will require more work. And we’re working on that. But we think that will be one of the big unlocks: having them be onboarded not only in terms of their environment, but also in terms of their understanding of design trade-offs, how the code base works, how to be a good developer in any one code base.
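
The codebase-specific knowledge described here (how to start the backend, how to check its status, which quirks apply) could be recorded as a small per-repo runbook that an agent consults before acting. The following is a minimal sketch under stated assumptions: the `Runbook` and `RunbookEntry` names, the commands, and the quirk text are all hypothetical illustrations, not Cursor’s onboarding format.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class RunbookEntry:
    """One codebase-specific task and the command that performs it."""
    task: str
    command: str
    quirk: str = ""  # free-text note about anything non-default


@dataclass
class Runbook:
    """Hypothetical per-repo runbook an agent could read before acting."""
    entries: Dict[str, RunbookEntry] = field(default_factory=dict)

    def add(self, task: str, command: str, quirk: str = "") -> None:
        self.entries[task] = RunbookEntry(task, command, quirk)

    def lookup(self, task: str) -> Optional[RunbookEntry]:
        # None signals a gap: the agent has to ask or investigate.
        return self.entries.get(task)

    def render(self) -> str:
        """Render the runbook as plain text suitable for an agent's context."""
        lines = []
        for entry in self.entries.values():
            line = f"- {entry.task}: {entry.command}"
            if entry.quirk:
                line += f" (note: {entry.quirk})"
            lines.append(line)
        return "\n".join(lines)


# Illustrative entries only; the commands are made up for this example.
runbook = Runbook()
runbook.add("start backend", "npm run watch", quirk="needs the database up first")
runbook.add("check backend status", "curl -s localhost:3000/health")
```

Returning `None` from `lookup` for an unknown task makes the missing knowledge explicit, which matches the point that setup is never simply done or not done.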

Samantha: Не знаю. Одна штука, которую мы выучили — мы в Cursor (компании) в этом году. Есть реально классный блог-пост, который Judi и другие люди в agent quality team выложили про dynamic file context.

swyx: It’s a lot of Cursor rules. Or is it gonna be something else? Is it gonna be a file? Do we just call it another markdown file with a different name, and

swyx: Это твоя команда — другая команда?

Samantha: I don’t know. One thing that we learned at Cursor, the company, this year: there’s a really great blog post that Judi and the other people on the agent quality team put out about dynamic file context.

Samantha: Другая команда, да. Они работали по сути — делая много всего, file system, всё — file system. И поэтому много моего мышления лично про память за прошедший год изменилось — стало больше согласованным с этим, где это даёт агенту указатели на вещи, аннотации [01:03:00] к вещам.

swyx: Is that your team, or a different team?

Второе — я начала думать о памяти как о подмножестве self-auditability агента и self-awareness. По сути, агент может захотеть предложить аннотации или ссылки или memory файлы себе, когда находит, что есть какой-то gap в его функциональности в его собственном harness, который, возможно, нужно заполнить какой-то информацией на полу-постоянной основе.

Samantha: Different team, yeah. And they were working on basically making a lot of things file-system based: everything is a file system. And so a lot of my thinking personally on memory this past year has changed to be more aligned with that, where it’s like giving the agent pointers to things, annotations [01:03:00] to things.

Но есть целый ряд других вещей, являющихся побочным эффектом self-auditability, которые реально интересны — потенциально находить конфликтующие инструкции или skills и rules, которые могут быть «эти баг’ают друг друга». И также вещи как починка проблем DevEx, на которые он наталкивается.

The second thing is that I’ve started to think about memory as a subset of agent self-auditability and self-awareness. Basically, the agent might wanna propose annotations or links or memory-like files to itself when it finds that there’s some gap in its functionality in its own harness that might need to be filled by some piece of information on a semi-permanent basis.

Я думаю, по сути dynamic file system стафф наверное очень многообещающ для памяти. И есть также эта нотация, что нужно, чтобы агент был чуть более self-aware в плане способности идентифицировать gap’ы в собственной функциональности и решать, как их заполнить.

But there’s a whole bunch of other things that are a side effect of self-auditability that are really interesting, like potentially finding conflicting instructions, or skills and rules that might be like, eh, these are bugging each other. And also things like fixing DevEx problems that it runs into.
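
One simple form of the self-audit mentioned here, finding instructions that conflict, can be sketched if rules are reduced to key/value settings. That reduction is an assumption made for illustration (real rules are free-form prose), and the `find_conflicts` helper and the file names below are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# A rule is simplified to (setting, value, source file) for this sketch.
Rule = Tuple[str, str, str]


def find_conflicts(rules: List[Rule]) -> Dict[str, List[Tuple[str, str]]]:
    """Return settings that are given different values by different sources."""
    by_setting: Dict[str, List[Tuple[str, str]]] = defaultdict(list)
    for setting, value, source in rules:
        by_setting[setting].append((value, source))
    # Keep only settings where more than one distinct value was requested.
    return {
        setting: entries
        for setting, entries in by_setting.items()
        if len({value for value, _ in entries}) > 1
    }


rules = [
    ("package_manager", "npm", "rules/frontend.md"),
    ("package_manager", "pnpm", "rules/monorepo.md"),
    ("test_command", "npm test", "rules/frontend.md"),
]
conflicts = find_conflicts(rules)
```

In this example only `package_manager` is reported, because two sources disagree on its value while `test_command` has a single value.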

Jonas: Это такой хороший пойнт.

I think that basically the dynamic file system stuff is probably very promising for memory. And there’s also this notion of needing to have the agent be a little bit more self-aware in terms of being able to identify gaps in its own functionality and decide how to fill them.
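
The idea of memory as files holding pointers and annotations, written by the agent when it notices a gap, could look roughly like the sketch below. It is only an illustration: the `propose_note` and `recall` functions, the JSON layout, and the example paths are assumptions, not how Cursor stores memory.

```python
import json
import tempfile
from pathlib import Path
from typing import Optional


def propose_note(memory_dir: Path, topic: str, pointer: str, annotation: str) -> Path:
    """Write a small memory file: a pointer to a resource plus an annotation."""
    memory_dir.mkdir(parents=True, exist_ok=True)
    note_path = memory_dir / f"{topic}.json"
    note_path.write_text(json.dumps({"pointer": pointer, "annotation": annotation}))
    return note_path


def recall(memory_dir: Path, topic: str) -> Optional[dict]:
    """Load a note if one exists; None means the gap is still unfilled."""
    note_path = memory_dir / f"{topic}.json"
    if not note_path.exists():
        return None
    return json.loads(note_path.read_text())


# Usage: a temporary directory stands in for the agent's memory folder.
with tempfile.TemporaryDirectory() as tmp:
    memory = Path(tmp) / "memory"
    before = recall(memory, "backend-status")  # nothing known yet
    propose_note(
        memory,
        topic="backend-status",
        pointer="scripts/healthcheck.sh",  # hypothetical path
        annotation="Run this instead of polling the port directly.",
    )
    after = recall(memory, "backend-status")
```

The note stores a pointer rather than the content itself, so the agent re-reads the referenced file when it needs the details.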

Self-awareness в широком смысле — это реально большая вещь, которой Sam, я думаю, нас [01:04:00] подталкивала делать всё больше и больше — где агент должен понимать, как работает его окружение, должен понимать, как работают секреты. Ему нужно быть self-aware о собственном harness и окружении. И потом, и ты

Jonas: That’s such a good point.

swyx: думаешь, это не присуще модели — нужно делать.

Like self-awareness broadly has been a really big thing that I think Sam has pushed us to [01:04:00] do more and more of where the agent should understand how its environment works, it should understand how secrets work. Like it needs to be self-aware about its own harness and its environment. And then, and you

Jonas: Это специфика, да? Если он работает в Cursor versus каком-то другом sandbox — это немного разное. И потом другая часть, которая становится реально интересной — это когда модель начинает редактировать собственный system prompt.

swyx: think this is not inherent in the model, you have to do it.

swyx: Да.

Jonas: It’s specifics, right? If it’s running in Cursor versus some other sandbox, that’s a bit different. And then the other part of it that starts to get really interesting is when the model starts editing its own system prompt.

Jonas: Что это вообще значит? Как делать это безопасно и

swyx: Yeah.

swyx: переборщить?

Jonas: What does that even mean? How do you do that safely and then way over

Это просто research, верно? Это не — это

swyx: do that?

Jonas: я думаю, она это сделает. Да. Она будет управлять собственным контекстом. И system prompt — это часть контекста, и можно спорить про

This is just research, right? This isn’t, this is

Samantha: Да, типа другие вещи, которые она может решить включить или выключить в зависимости. И всё это, self-awareness для нас в этом контексте — это не сама модель имеет нотацию сознания, а скорее знание того, в какой системе она работает, и ограничений этой системы, и потенциально способность иметь agency в оптимизации себя, чтобы оперировать наилучшим образом в этой системе.

Jonas: I think it will do that. Yeah. It will manage its own context. And so system prompt is part of the context, and you can argue about

Это была одна из [01:05:00] первых вещей, которые я выучила в Dot, когда мы запускались — мы сделали модель — или сделали агента — или, как бы мы его ни называли в то время — он был гораздо менее агентным — сделали продукт работающим очень хорошо в определённом количестве вещей, но у него не было полного self-awareness собственных границ.

Samantha: Yeah, like other things that it might decide to turn off or on, depending. And all of this self-awareness, to us in this context, is not the model itself having a notion of consciousness, but more knowing what system it’s operating in and the constraints of that system, and potentially being able to have agency in optimizing itself to operate best in that system.

Так что люди такие: «Эй, можешь сделать эту штуку?» И штука была, и могла быть сделана, и продукт такой «ой нет». А я: «но можешь же». И так по сути одна из самых ранних вещей, которые я нашла —

This was one of the [01:05:00] first things I learned at Dot when we launched: we had made the model, or the agent, or whatever we would call it (at that time, it was far less agentic), we had made the product work very well at a certain number of things, but it didn’t have complete self-awareness of its own boundaries.

swyx: верь в себя.

So people would be like, Hey, can you do this thing? And the thing was there and could be done, and the product would be like, oh no. And I’d be like, but you can. And so basically that was one of the earliest things I found is

Samantha: Знаю, как product developer — ей нужно одновременно уметь делать штуку и иметь полное знание о своей способности делать штуку.

swyx: believe in yourself.

Это не всегда очевидно одна и та же часть промта вообще.

Samantha: I know as a product developer, like it needs to both be able to do the thing and it needs to have complete knowledge of its ability to do the thing.

Да.

Those are not always obviously the same part of the prompt at all.
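
One way to keep the two from drifting apart, the ability to do the thing and the knowledge that it can, is to generate the capability description from the same registry that holds the tools. This is a minimal sketch of that design choice; the `tool` decorator, the `set_reminder` tool, and the prompt wording are all hypothetical and not taken from Dot or Cursor.

```python
from typing import Callable, Dict

TOOLS: Dict[str, Callable[[str], str]] = {}
DESCRIPTIONS: Dict[str, str] = {}


def tool(name: str, description: str):
    """Register a function as a tool and record what it can do."""
    def decorator(fn: Callable[[str], str]) -> Callable[[str], str]:
        TOOLS[name] = fn
        DESCRIPTIONS[name] = description
        return fn
    return decorator


@tool("set_reminder", "Set a reminder for the user")
def set_reminder(text: str) -> str:
    return f"reminder set: {text}"


def capability_prompt() -> str:
    """Build the 'what you can do' prompt section from the registry itself."""
    lines = ["You can do the following:"]
    lines += [f"- {name}: {desc}" for name, desc in sorted(DESCRIPTIONS.items())]
    return "\n".join(lines)
```

Because `capability_prompt` reads from the registry, adding a tool updates what the agent is told it can do in the same step, so the product cannot say “oh no” about something that is actually implemented.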

swyx: Да.

Yeah.

Samantha: Это то, что, я думаю, продолжает быть темой в экосистеме — пользователи часто атрибутируют возросший интеллект системе, которая более self-aware и более способна манипулировать собой, чтобы хорошо справляться в системе.

swyx: Yeah.

Если это имеет смысл.

Samantha: It’s something that I think has continued to be a theme in the ecosystem: users will often attribute increased intelligence to a system that is more highly self-aware and is more able to manipulate itself to do well in a system.

swyx: Да. Это более абстрактно, чем я когда-либо думал, в нашем Cursor-обсуждении. Круто. Это не тот вид [01:06:00] разговора, который ты ведёшь в —

If that makes sense.

Samantha: мы говорим об этих штуках всё время —

swyx: Yeah. This is more abstract than I ever thought we would get in a Cursor discussion. Cool. That isn’t the kind of [01:06:00] conversation that you have

swyx: улучшая

Samantha: in, we talk about this stuff all the time to

Samantha: да.

swyx: improving

swyx: агентов в целом.

Samantha: Yeah.

Jonas: Да. К твоему пойнту про agent layer и думая много про модели и harness, и продукт, и affordances.

swyx: Agents in general.

Да. Падает из

Jonas: Yeah. I think to your point right about the agent layer and thinking a lot about models and the harness and the product and the affordances like that.

swyx: Нет, я имею в виду — вы, ребята, для меня — нужный пример того, как выглядит agent lab и может быть успешным, и я думаю, люди всегда голодны до инсайтов в то, как вы оперируете, так что спасибо, что нашли время поделиться.

Yeah. Falls from the

Samantha: Да. Спасибо, что пришёл.

swyx: No, I mean you guys are my sort of leading example of what an agent lab looks like and how it can be successful, and I think people are always hungry for insights into how you guys operate, so thank you for taking the time to share.

Да. Спасибо.

Samantha: Yeah. Thanks for coming.


Yeah. Thank you.