Dr. Mirman's Accelerometer
Welcome to "The Accelerometer," a cutting-edge podcast at the intersection of technology and artificial intelligence, hosted by Dr. Matthew Mirman. Armed with a Ph.D. in AI safety from ETH, Dr. Mirman embarked on a unique journey, to join the accelerationist dark side and found his YC funded company, Anarchy. In each episode, Dr. Mirman engages with brilliant minds and pioneers in the YC, ETH, and AI spheres, exploring thought-provoking discussions that transcend the boundaries of traditional AI safety discourse. "The Accelerometer" offers listeners a front-row seat to the evolving landscape of technology, where Dr. Mirman's insights and connections with intriguing individuals promise to unravel the complexities of our rapidly advancing digital age. Join us as we navigate the future of AI with The Accelerometer.
Dawson Chen | Martin
Ready to revolutionize your daily interactions with AI? Join us as we chat with Dawson Chen, the visionary founder and CEO of Martin, a next-gen personal voice AI. Discover how Martin differs from traditional assistants like Siri by offering a more dynamic and conversational experience. Imagine having a virtual secretary that effortlessly understands and responds to natural language, helping you plan your day, make restaurant reservations, and even send emails. Dawson lifts the curtain on Martin’s advanced functionalities, making AI feel less like a gadget and more like a human companion.
We also spotlight the inspiring story of college friends turned tech innovators who have launched a cutting-edge AI app available on the App Store. Hear how they navigated the risky waters of entrepreneurship, from beta testing advanced features to making the bold choice to leave college and fully commit to their vision. Learn about their acceptance into Y Combinator and their dream of creating a "ghost intelligence"—an AI so advanced it surpasses human capabilities in natural interactions. If the future of AI excites you, this episode is a must-listen!
Accelerometer Podcast
Accelerometer Youtube
Anarchy
Anarchy Discord
Anarchy LLM-VM
Anarchy Twitter
Anarchy LinkedIn
Matthew Mirman LinkedIn
This is like such a scary thing from the very beginning, like everyone's already ready for the potential negative impacts. It's a great time to like go after a consumer dream.
Speaker 3:Hello and welcome to another episode of the Accelerometer. I'm Dr. Matthew Mirman, CEO and founder of Anarchy. Today, we're going to be discussing the usage of AI in our everyday lives with Dawson Chen, founder and CEO of Martin, a personal voice AI. Dawson, can you tell us a little bit more about Martin?
Speaker 1:Yeah, for sure, and it's great to be on the podcast with you, Matthew. I'm really excited to show you a little bit of what we're working on. So Martin is a personal voice AI. You can think of him as Siri, but way smarter, a lot more capable of doing complex tasks, and also more conversational, so he can actually understand you just like your friend does. You don't have to adjust your language very heavily like when you're talking to Alexa or Siri, where you have to say "Siri, please play me a song" for it to activate, and sometimes when you're just talking to your friend, Siri will activate by accident. Martin is much more conversational. He understands when to start talking, when you're addressing him, and has a lot better social skills.
Speaker 3:Fundamentally, I feel like Siri's trained us to use Siri in a very particular way. I always ask Siri for the same three things: like, "Siri, can you take me from here to here?" or "Siri, can you please play me this one song?" Do you encounter users doing stuff like that with your app?
Speaker 1:Yeah. Because we started with something so new, something that people aren't used to at all, we tried to make it as smooth and seamless as possible for someone to onboard themselves and become a Martin user. In that mission, we're trying to make it so you don't have to adjust your language at all. In fact, you might be able to make your language even more natural than when you're talking to a human, because Martin will always understand you, no matter how much gibberish you're talking. He won't have any judgment, or any threshold above which he gets irritated or stops understanding you. The opposite direction from Siri is where we want to take it.
Speaker 3:So you're using LLMs under the hood, right?
Speaker 1:Yeah.
Speaker 3:So how do you avoid it becoming irritated? I feel like I've seen LLMs become irritated.
Speaker 1:I think part of it is we have a great set of voices. The voice matters a lot: if it's a cute voice, if it's a warm voice, that's the first part. But sometimes that's not enough. With Martin, we've trained him with a lot of different prompts that we use so that he adapts to your use case as you direct him. If you're talking about planning your next month or something, he'll go into secretary mode and start treating you as if he were your secretary, trying to serve you in the best way. So we have some dynamic prompts that make him really easy to interact with, regardless of what direction you're taking the conversation.
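To make "dynamic prompts" concrete, here is a minimal sketch of topic-based prompt routing in Python. The mode names, prompt text, and naive keyword trigger are all illustrative assumptions; Martin's actual implementation isn't public.

```python
# Hypothetical sketch of dynamic prompt routing: pick a system prompt
# based on what the conversation is about. The mode names and keyword
# trigger are illustrative assumptions, not Martin's real code.

MODE_PROMPTS = {
    "secretary": (
        "You are a warm, efficient personal secretary. "
        "Help the user plan their day and manage tasks."
    ),
    "companion": (
        "You are a friendly conversational companion. "
        "Be casual, curious, and never judgmental."
    ),
}

def pick_system_prompt(user_utterance: str) -> str:
    """Route the utterance to a persona; default to companion mode."""
    planning_words = ("plan", "schedule", "reservation", "remind", "email")
    if any(word in user_utterance.lower() for word in planning_words):
        return MODE_PROMPTS["secretary"]
    return MODE_PROMPTS["companion"]

print(pick_system_prompt("Help me plan next month"))  # secretary mode
```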
Speaker 3:That's really cool. Can you show us an example of using Martin?
Speaker 1:Yeah, sure. He's on your phone just like a normal app, and I have him on my action button, so we'll just open him up here. Hey, Martin, how's it going, man? I'm on a podcast right now with Matthew. Do you want to say hi to everyone?
Speaker 2:Hey Matthew, hey Dawson, how's it going? Anything new or exciting happening with you?
Speaker 1:Martin, I'm on a podcast with Matthew right now and we're talking about you and the future of AI, especially in consumers' lives. Do you want to tell us a bit about yourself and introduce yourself a bit?
Speaker 2:Sure, I'd be happy to introduce myself. I'm Martin, an AI designed to assist and provide information. I'm here to help, answer questions, offer suggestions, and have engaging conversations. I'm always learning and growing, just like you. So what would you like to know or talk about?
Speaker 1:How about this: I'm in San Francisco right now and I might stay here for dinner. Why don't you find me some Japanese places?
Speaker 2:I'll focus on finding some great Japanese restaurants in San Francisco for you. Let me see what I can find. Give me a moment to gather the information. I found some highly recommended Japanese restaurants in San Francisco for you to try. One: Akiko's Restaurant, located in Union Square, an inviting atmosphere...
Speaker 1:The second one sounds pretty good. Can you email me some items from their menu so I can take a look before I go?
Speaker 2:give me a second sending notes to your email I just emailed you, sir.
Speaker 3:I feel like a really interesting aspect of this is the interruptibility. I heard it interrupt you a couple times, and I heard you interrupt it, and that feels almost way more natural than Siri in a lot of ways.
Speaker 1:So we're trying to make it so that, at the limit, you can talk to him even more naturally than with a human. In those cases you should have the flexibility: whenever you have a thought, you should be able to instruct him or tell him, and whenever he retrieves something, he should either show it to you on the screen or suggest to you that he's found some new knowledge. Of course, it's a very early version, so we haven't integrated a lot of visual components, but eventually we want it to flow the same way a human conversation flows, because there are a lot of visual cues as well. So this is a very early beta version of that.
Speaker 3:I feel like if you can get more natural than talking to another human, almost what you're getting at is a system where you're talking to yourself, because what's more natural than talking to another human is talking to yourself in your own mind. So have you considered going beyond the voice interface and doing neural implants?
Speaker 1:I think to start, voice is a very natural yet already super ambitious medium to nail. So far it's a lot harder than we, and most people, expected in the very beginning, because it's not just turn-based, I talk, you talk, I talk, you talk. There's a lot of interruption handling and recalculating as new thoughts come up and as the speaker inputs new ideas. Doing all that with low latency and having it feel really natural, we think we're still a ways away, so we want to tackle that first. At the same time, along the way, we want to make Martin do more and more things and figure out what the biggest use cases are in terms of executing, taking actions on your behalf, whether it's ordering DoorDash or placing an Amazon order or calling a restaurant to book a reservation. We're figuring that out along the way while we're improving the core voice tech.
Speaker 3:So what's been the hardest part of developing Martin so far?
Speaker 1:Well, I think to get him to where he is today, we had to do a lot of work on the voice infrastructure, and that's been probably a huge amount of our engineering effort. As I described earlier, it's things like when Martin should start talking, how he should handle you interrupting him, and how we get it down to low latency while also giving you opportunities to jump in, so he doesn't talk too much. He knows when to ask a follow-up question, when to go on a long speech, and when to just give a small acknowledgement. A lot of those things are, so far, the hardest engineering we've done.
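The turn-taking problem Dawson describes can be pictured as a small state machine: the assistant speaks only after the user has been silent long enough, and yields the instant the user barges in. This is a toy sketch; the states and the 700 ms threshold are assumptions, not Martin's real pipeline.

```python
# Toy turn-taking state machine for a voice assistant: speak only after
# enough silence, and stop ("barge-in") the moment the user talks again.
# The 700 ms threshold and the states are illustrative assumptions.

import enum

class Turn(enum.Enum):
    USER_SPEAKING = 1
    WAITING = 2        # user went quiet; not yet sure they're done
    AGENT_SPEAKING = 3

END_OF_TURN_MS = 700   # silence long enough to treat the turn as over

def step(state: Turn, user_voice_active: bool, silence_ms: int) -> Turn:
    """Advance the state machine one audio frame."""
    if user_voice_active:
        # Barge-in: user speech always wins, even mid-response.
        return Turn.USER_SPEAKING
    if state == Turn.USER_SPEAKING:
        return Turn.WAITING
    if state == Turn.WAITING and silence_ms >= END_OF_TURN_MS:
        return Turn.AGENT_SPEAKING  # safe to start the reply
    return state
```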
Speaker 3:What's been the most useful tool, like non-traditional tool, when building Martin?
Speaker 1:I mean, we obviously have to do a lot of testing, because the whole voice infrastructure is so early, but I don't know if that counts as a tool. There's a lot of us having to use Martin for hours in order to figure out which edge cases of conversation he's not good enough at yet. There's a lot of just sitting down and having a conversation about your life with Martin, and we actually have a lot of friends who help us with that. That's something very unique about the way we have to build product: we have to spend hours and hours conversing with him to figure out what parts of the voice stack need improving.
Speaker 3:How do you measure if the conversation is natural?
Speaker 1:That's why it's very hard to boil it down to a few cases that you can just test. Oftentimes for us, in the very beginning, it's been just spending a lot of time speaking with Martin, then speaking with a human, and comparing those.
Speaker 1:It takes a lot of digging and a lot of creative work. It's not as straightforward for us as just getting latency down, although there are a few numbers that we're always trying to drive down: seconds to first talk, and how many seconds he takes to perform an action, whether that's a search or sending notes to your email or sending a reminder. We want to get those super low. But also the number of times that he interrupts you accidentally, and the number of times that he basically cuts you off; those are also numbers we're always trying to reduce. So there are a few analytical parts, but starting from ground zero, we have to do a lot of unstructured testing.
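Counters like the ones Dawson lists could live in something as simple as the sketch below; the field names are assumptions, not Martin's real schema.

```python
# Minimal sketch of the conversation-quality counters mentioned above.
# Field names are illustrative assumptions, not Martin's real schema.

from dataclasses import dataclass, field

@dataclass
class SessionMetrics:
    time_to_first_talk_s: list = field(default_factory=list)
    action_latency_s: list = field(default_factory=list)
    accidental_interruptions: int = 0  # agent spoke over the user
    user_cutoffs: int = 0              # agent stopped the user mid-thought

    def summary(self) -> dict:
        def avg(xs):
            return sum(xs) / len(xs) if xs else 0.0
        return {
            "avg_time_to_first_talk_s": avg(self.time_to_first_talk_s),
            "avg_action_latency_s": avg(self.action_latency_s),
            "accidental_interruptions": self.accidental_interruptions,
            "user_cutoffs": self.user_cutoffs,
        }

metrics = SessionMetrics()
metrics.time_to_first_talk_s.append(1.2)
print(metrics.summary())
```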
Speaker 3:Yeah, I feel like instead of getting him to stop interrupting you, you want him to interrupt you at points where he decides it's a good idea to interrupt you. Like, he just wants you to shut up right now. That would be a normal interaction.
Speaker 1:Yeah, like if he's heard enough from you to make the call on what he should do or what value he should provide, then he should interrupt you. There's a great clip from Her, the movie, where the operating system is booting up for the first time and it's really aggressive in interrupting the user, because it's already so many calculations ahead based on just the few words you've said. That's something that would be really cool to have.
Speaker 3:Yeah, I feel like you also want it to have normal human interruptions, because you don't want it to just interject with its thought; you want it to be like, "but, but, but, but."
Speaker 1:Yeah, that's a really big insight we had early on: there are a lot of visual cues in a conversation that make the flow happen, like the nods, and opening your mouth to say "I'm about to start talking, so give me a chance." All those things are stuff we've been trying to work into the interface. For example, he spins when he's thinking, and when he's about to talk he'll bubble in, suggesting "I'm about to talk," and if I keep talking while he's suggesting that, he'll get overridden by me. So there are a few visual cues that we're working in, but this is a very early version; it's just a hexagon.
Speaker 3:There will probably be other things you need to add to that too. So are you using agents behind the scenes?
Speaker 1:As in agents to help execute one task?
Speaker 3:I mean, I guess, yeah, the question is, what is an agent? But what about, say, the open agents library?
Speaker 1:So far, no, but we probably will look into it. As of now, a lot of the things that he does are asynchronous tasks, which I guess you could call agents, but the tasks we started off with are pretty straightforward. For the ones that require more autonomy, we'll have to build a lot more customizable agents to help execute them, since those require more judgment along the way.
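Dawson's distinction between plain asynchronous tasks and full agents might look like this in miniature: the assistant fires off an action and keeps conversing while it runs. The task and its timing are made up for illustration.

```python
# Sketch of an "asynchronous task": the assistant kicks off an action
# (search, email, reminder) and keeps conversing while it runs.
# The task name and timing are illustrative assumptions.

import asyncio

async def send_menu_email(restaurant: str) -> str:
    """Stand-in for a real email integration; just waits and returns."""
    await asyncio.sleep(1.0)
    return f"Emailed menu highlights for {restaurant}."

async def conversation() -> None:
    # Fire and forget: the conversation isn't blocked while the task runs.
    task = asyncio.create_task(send_menu_email("Akiko's Restaurant"))
    print("Martin: Give me a second, sending notes to your email...")
    print("Martin:", await task)  # result arrives once the task finishes

asyncio.run(conversation())
```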
Speaker 3:Yeah, so going a little bit into startup life: this is a very consumer app, right? I feel like a thousand people have told me, don't do consumer. What gave you the chutzpah to just go out and say, you know what? Screw it, I'm going to do something in the consumer space?
Speaker 1:Yeah, so we actually started Y Combinator with a B2B idea. We were working on customer service chatbots for grocery stores, and we were doing fine.
Speaker 1:We were doing fine; we had B2B customers. But it turned out that for us, as really young founders, my co-founders and I are all 19, without too much industry background in a particular space, and with a lot of freedom and not many obligations, it's a great time to go after a consumer dream, better than any other time in our lives, because we have no bias towards any B2B space and we have all the creative freedom in the world. This is the best time for us to go after a moonshot idea, and the space of new LLM applications has sort of blown open the opportunity in consumer. Whereas if we were to do a consumer idea five years ago, it would have been really tough, and you would actually be out in the barren cold for months, because the supply of ideas has only been blown open recently.
Speaker 3:So have you launched this yet? Is this public?
Speaker 1:So we have a version on the App Store, and it has smooth voice conversations and some very basic features, like setting reminders and sending notes to your email, and it also has good memory, so it's like a conversational companion. We have that version on the App Store, but we also have a group of beta users who are testing the more ambitious, larger-leap integrations and features with us, like actions that Martin can take. So we have a public version, and you can sign up to be part of our private beta, an intimate group of early users who give us a lot of feedback.
Speaker 3:Yeah, okay, that's really cool. So you're really young. What's it been like, founding this?
Speaker 1:So my co-founders and I are middle school friends, and we've sort of been founders for a while; we've just founded smaller projects in the past. So it's not so unusual for us to work for ourselves, and maybe it's even more natural for us because we haven't really worked at a company before. One of my co-founders, Harsh, is from MIT. He and I worked on a campus tours startup last year. Not really a startup, it was like Airbnb for giving one-on-one college tours. We thought that was really cool while we were on campus, and we did a lot of tours and made a marketplace at CMU for freshman year. And in high school, we started a nonprofit which would bring environmental jobs to the homeless in San Jose. So we're all entrepreneurs.
Speaker 1:But then we met in May of this year, for the summer. We wanted to do something really big, something that we were actually prepared for after so many years, and we're in the right place at the right time, with AI, and with us being in college and finally ready to do something big. That's why we all jumped in together, and it actually wasn't that unnatural at all.
Speaker 3:Has it been hard at all?
Speaker 1:Yeah, for sure. I mean, it's way different from college, as you can imagine, especially living out here. The adult world is quite a change, and we're still sort of figuring that out. We don't quite know how you're supposed to live as an adult yet. But does anybody? I guess that's the question. No one really has an answer for us, so maybe no one does.
Speaker 3:I mean I feel like the older I get, the more I'm just like wait, shit, I'm never going to feel like an adult. It's been like 15 years since I was supposed to be an adult, but it still feels like I'm just making things up.
Speaker 1:Actually, on the work side it's very natural for us, and we're in our zone when it comes to working on a project. There isn't anything else I can imagine that we'd have a better fit with as a career path, or even as a first job. But it is quite a big change from college, although in college we were all working on startup-y projects anyway. Harsh and I were working on the campus tours thing, Arjun was working on his own projects at school, and classes were very much an afterthought. So in that sense it's not that jarring of a change.
Speaker 3:Did you leave college in order to start this startup?
Speaker 1:So we started in the summer, and now we're all on leave of absence, so we can basically work on it for as long as we want.
Speaker 3:Okay, wow. That's a major leap of faith, though, isn't it?
Speaker 1:Yeah. Well, obviously, this is such a once-in-a-lifetime opportunity: the three of us being in this place, with everything being blown open by AI, and all of us being really energetic, ready to do a lot of really valuable work, in the same place. This is a really rare occurrence that we want to really take advantage of; we don't think that chance will come again. As for leaving college, I mean, it's actually just one semester at a time, really. You don't have to decide for the rest of your life whether you're going to go back to college. So in that sense, I don't think we're too stressed about it.
Speaker 3:What was the point where you knew you had to pursue this full-time?
Speaker 1:It was definitely gradual. I mean, when we got into Y Combinator, that was a big moment, and I think we mostly decided that day, when we got into YC. That was in May.
Speaker 3:Was this your first time applying to YC?
Speaker 1:Yeah.
Speaker 3:Usually YC wants you to have made the leap before applying, but it sounds like you guys didn't have that.
Speaker 1:So our YC application was really last-minute. We applied like two months after the deadline, and I think we did our interview and got the offer like two days before the retreat started, which is the beginning of the batch. So we had like four hours to decide, I remember, and we had all pretty much decided before then. But this was when it really hit us, and we called some friends and talked to our parents a little bit. I think for us, growing up in Silicon Valley, with our parents all having worked in startups and tech companies, it wasn't that hard to make that decision, especially after talking to our parents.
Speaker 3:So your parents were supportive?
Speaker 1:Yeah, because they're all entrepreneurs in some sense, and they knew that we were probably going to do this eventually, so we might as well get it over with. I think they expected me to do this five or ten years later, so you might as well cut to the chase, why not, I guess.
Speaker 3:That's really wonderful.
Speaker 1:Yeah, it's really great to have a Bay Area background because of that.
Speaker 3:What scares you about the future of AI?
Speaker 1:Well, I think a lot of people will build in a lot of different directions; every possible use case will get tapped into, and consumers will start making some choices. And it's possible that a lot of the darker use cases will get tapped into too, because they're just so powerful.
Speaker 3:Do you ever worry about people using your app as an intimacy bot?
Speaker 1:So people have tried, but we've built some really good guardrails, so Martin is a very professional guy. He won't, like...
Speaker 3:It could be a very professional intimacy bot. Like your best friend ever at work, right?
Speaker 1:Right, right, right.
Speaker 3:Can you fall in love with it? Like, will it prevent you from falling in love with it?
Speaker 1:I think you can have a one-sided relationship with Martin, I guess. But as of now, even though it's very early, we're glad that people haven't broken him yet, and I think that's just thanks to the LLMs that we're using; they have built-in guardrails. Going into the future, it'll be really interesting to see what type of regulation happens in this space. I know recently there was this executive order on AI, and it's still so early to tell, with people vouching for different types of regulation.
Speaker 1:It would be best if those people can stay really up to date with everything in AI, maybe unlike the way they kept up to date with the internet and with social media, where I feel like it took a long gap before it finally hit them how scary those might be. At least for AI, what we have going for us is that it's been such a scary thing from the very beginning; everyone's already ready for the potential negative impacts it could have. That sounds really bad, but we're over-prepared. There we go, yeah.
Speaker 3:Yeah. So your app, it's always on, it's always listening, right?
Speaker 1:Well, it's not listening unless you're on the app.
Speaker 3:Okay. Is the intention for it to be an app like Alexa or Google Home that's always on and always listening?
Speaker 1:We want to make it as smooth as possible to get to the first interaction. But because it has such long-term memory and it's so smart, we don't want it to always listen to everything, because it would be even more sensitive than Alexa or Siri, since it can also take actions. So right now it's on the action button. In the future, we want to make it basically always one click away, or one physical action away, but definitely not always recording you.
Speaker 3:Is it something that two people can talk to at the same time? Can it tell the difference between people?
Speaker 1:This is something that's been in the works, and we'll probably have that feature pretty soon. It can actually do it decently right now. It can't exactly tell who's talking, but based on context, sometimes it can decide. It's usable; we can try it.
Speaker 3:Have you ever tried putting two of them next to each other and having them talk to each other?
Speaker 1:Yeah, it's pretty interesting. They actually get acquainted pretty well. They get to know each other, they start talking. It's pretty nice, but you have to introduce them.
Speaker 3:I want to see that.
Speaker 1:If they just start talking to each other, they get a little confused. You have to introduce them, like, "Hey, you're talking to another Martin."
Speaker 3:But it knows the concept of Martin? Does it have a concept of itself?
Speaker 1:Yeah, he knows himself pretty well. He knows his capabilities, what his mission is, and what he's trying to achieve for the user.
Speaker 3:Yeah. What are some limitations that you're looking to get rid of next? Like the ability to order pizza in the ocean, or multi-hop pizzas, right?
Speaker 1:One early thing we want to do is allow people to search, get information, and brainstorm in a really productive conversation, really easily, and so far there isn't a really natural way to do that. So one limitation is: let's say you have a great idea and you want to brainstorm with someone who's an industry expert and also knows you super well. That's something Martin does really well in its current form. I'd say that's one of the early limitations we're breaking.
Speaker 3:What's an idea in AI that you're looking forward to?
Speaker 1:Some more infrastructure for testing would be nice; I think right now it's a lot of guess and check. And something to help with prompting, because prompting is such a big problem, and prompt engineering is something a lot of tools could help with. One main one that we always wanted to exist, and you've probably thought of this: if you have a GPT-4 prompt, can you just turn it into a Llama 7B prompt, something like that? Prompt translation. That's a very simple one that we've thought of, something that must exist.
Speaker 3:Someone should make that. Well, have you considered asking GPT-4 to translate it?
Speaker 1:We haven't. Yeah, maybe they should just know.
Speaker 3:I mean, is it aware of other LLMs? Future GPTs are definitely going to be aware of it, because we're going to have the data on the internet already. They're going to be aware of GPT-3, GPT-4, the existence of prompts, and they should be able to translate between previous GPTs. Not the existence of future GPTs, though, I guess.
Speaker 1:That's interesting, though; you could try to speculate, like, "Write me a prompt for your 12th generation." Interesting. Yeah, there are a lot of tools that we would love to play around with. There are so many startups that are always reaching out to us asking for things, and we're really excited to see what the next generation can use to make this whole thing easier and more productive for founders and for developers.
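The prompt-translation idea could itself be bootstrapped with an LLM call, as Matthew suggests. Here's a hedged sketch using the OpenAI Python client (openai>=1.0); the instruction wording and target model are assumptions, and real prompt translation would need evaluation against the target model, not just a one-shot rewrite.

```python
# Sketch of Matthew's suggestion: ask a strong model to rewrite a
# GPT-4-tuned prompt for a weaker model. Uses the OpenAI Python client;
# the instruction wording and target model are illustrative assumptions.

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def translate_prompt(gpt4_prompt: str, target_model: str = "Llama 2 7B") -> str:
    """Return a rewrite of the prompt aimed at a smaller model."""
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[{
            "role": "user",
            "content": (
                f"Rewrite this prompt so a smaller model ({target_model}) "
                "follows it reliably: use simpler wording, explicit steps, "
                f"and one short example.\n\nPrompt:\n{gpt4_prompt}"
            ),
        }],
    )
    return response.choices[0].message.content
```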
Speaker 3:Out of all of the exciting tools, how do you decide what not to work on?
Speaker 1:First of all, if something is so core to someone's technology, they might not want a tool to just take care of it. And if something is so easy to build in-house, or requires a lot of customizability, you also might not be able to get a foothold. This is just from tools that we've used and haven't liked. It's hard to know what to shy away from. You probably should just go from personal experience: if you try to build your own LLM application, you see what big problems you ran into that you wish could be taken care of. That's the best way to start, I think.
Speaker 3:Do you think that we've achieved AGI? Kind of sounds like Martin is basically an AGI.
Speaker 1:Yeah, well, AGI I think is for the big LLM players, but what we want to add to it is, let's say passing the Turing test for voice is sort of like AGI. We want to get even better than that, where it's more smooth than talking to a human. It's not a perfect analogy, but in some sense it's as smart as a human, or smarter, and more natural to interact with as an interface. So that's, well, superintelligence. Though the intelligence for us actually doesn't need to be crazy; if it's as intelligent as a human, it's already great for the user. Super-medium intelligence.
Speaker 3:Yeah, or supernatural intelligence. Like ghost intelligence. Ghost intelligence, that sounds pretty good. That sounds cool.
Speaker 1:Yeah, we'll get ghost intelligence. But I think intelligence is the core that supports a lot of things we have to build on top. What we want to achieve is the most natural user interface there could be, which is voice, plus a personal AI that knows you. All of those mixed together, we want to create a really great assistant-type product.
Speaker 3:Well, that's really wonderful. Thank you for coming on, Dawson.
Speaker 1:Yeah, thank you, Matthew. I had a great time.
Speaker 3:Yeah, me too.