Season 24, Episode 3

From Notebook to Production: Building End-to-End AI Systems | Mariano Semelman

Transcript

The transcripts are edited for clarity, sometimes with AI. If you notice any incorrect information, let us know.

Community Introduction and Slack Engagement

Alexey: Hi everyone, welcome to this event. This event is brought to you by DataTalks Club, which is a community of people who love data. We have weekly events, and today is one of them. If you want to find out more about the events we have, there is a link in the description. Click on that link and you'll see all the events we have in our pipeline. Do not forget to subscribe to our YouTube channel. This way you'll get notified by YouTube about all future events like the one we have today. Last but not least, we have an amazing Slack community where you can hang out with other data enthusiasts. The link is also in the description. (0:00)

Alexey: During today's interview, you can ask any question you want. There is a pinned link in the live chat. Right now, I see on my phone that the pinned links are at the bottom, not at the top. Maybe YouTube is running an A/B test or something. I don't know. (0:33)

Alexey: This is something new, but you'll figure it out. So the link is pinned. So use this link please to ask any question you want and we will be covering these questions during the interview. Now I will stop sharing my screen. I will open the questions we prepared for Mariano. (0:50)

Alexey: If you're ready, was it mate? Yeah, I'm having some mate. Yeah. So if it's not obvious to you where Mariano is from, now you saw the mate and you can probably make a very educated guess. Do people outside of Argentina drink mate? (0:00)

Mariano: Actually, yes. There is a big community in Syria. The largest importer of mate outside the region of Argentina, Brazil, and Uruguay is Syria. They drink it the same way we do. There is a story we would probably have to research, but I think there was some migration at the beginning of the previous century of Syrians who came to Argentina, then went back to Syria and brought it with them. (1:28)

Mariano: Actually, here in Spain the way to get yerba mate is sometimes through the Syrians who import it right into Europe. (2:05)

Alexey: But you can use your Argentinian connections to also sneak in some mates? (2:12)

Mariano: Not at all, because it's a bulky thing that fills your suitcase. We do bring other things, like dulce de leche. We like other stuff. (2:19)

Alexey: So, Mariano and I go way back. We worked at the same company, OLX. Back then Mariano was first a senior data scientist and then a data science manager, and I was a lead data scientist, so we collaborated a lot. At some point our paths parted ways, so Mariano is now in Spain, in Barcelona? (2:34)

Mariano: Yes. (3:01)

Alexey: And I am still in Berlin. At some point Mariano rejoined OLX. So, Mariano is a lead data scientist and machine learning engineer with 10 years of experience in e-commerce. Even though I no longer work at OLX, I still really enjoy talking to people from OLX. I am really happy to have Mariano today on this podcast interview. (3:07)

Alexey: The reason Mariano is here is I wanted to interview people who are hands-on practitioners and who work with GenAI. I heard what Mariano was working on, what kind of project he was working on and I thought I had to invite him and luckily he agreed. So now we have Mariano here. I still have the rest of the bio that I probably need to read. (3:07)

Alexey: He focuses on building end to end solutions and explores how agentic tools can make the path from research to production more seamless. At OLX, he works on AI powered media solutions that simplify the listing process for sellers. Mariano, I'll ask you to unpack what it means. I have some ideas because I worked at OLX. (3:07)

Career Journey: From Argentina to Barcelona

Alexey: I will ask you but before that, this is not the first time you're here. For those who don't know you can you please tell us about yourself and about your career journey so far? I mentioned a few things but I omitted many things so can you tell us more about yourself? (4:17)

Mariano: So, I'm originally from Argentina. I have lived in Spain for the last five or six years. (4:36)

Alexey: Did you live there really that long? (4:47)

Mariano: Yeah. I was one of those who made life decisions during COVID. I decided to change countries during 2020 and I moved here in 2021. (4:55)

Alexey: I remember because I was sitting there in the office interviewing somebody, and you were sitting nearby. After the podcast interview, you came over saying, "Hey, good job. I really enjoyed this interview," because you had been sitting next to me and listening to it. (5:11)

Alexey: Now you say 5 years ago you moved to Spain. So it means that you have been doing podcast for a long time already. Wow. I never realized that. So look, I don't know. (5:11)

Alexey: Do you number them? How many have you done by now? (5:11)

Mariano: I have no idea. Maybe I should check. You should check. Yes. So I worked in Argentina, I worked in Berlin, and now I work here. (5:41)

Mariano: My background is in computer science. I did a master's in neuroscience, but I never used that part. I was always interested in AI, so one way or another I've always worked in machine learning, data science, and machine learning engineering. (5:41)

Mariano: I started as a software engineer while I was still at university, but as soon as I graduated I started doing data science, ML, AI and so on, mostly applied to e-commerce. I had some adventures in other areas, mostly fintech. Was it fintech? No, identity verification, so it was for financial institutions. (5:41)

Alexey: Yes. Correct. (6:47)

Mariano: So it is checking that the face and the document are real and that they match each other. And yes, I've moved around different places. I've always been focused on building applications. Application not as in a mobile application, but as an ML or AI application that solves a specific user problem or business problem. (6:52)

Product-Driven AI vs. Traditional Reporting

Mariano: With time I learned to work very closely with product functions, meaning product managers and so on, as well as product analytics. I usually like working on something that touches the customer directly or that affects what they experience in a more or less direct way. I haven't worked much on AI or data science applications where the result is a report or a BI approach. I don't have that much experience with large BI projects. (7:18)

Mariano: Small things here and there, of course, but doing something large that helps decision making, no. Mostly it's product applications. (7:18)

Alexey: Yes. Many strong years working in search and recommendations. For me, this is what I think of when I think about you. (8:24)

Mariano: Yeah. But not anymore. It's been years since I worked with that. It's still in my head, but not like it used to be. (8:36)

Alexey: What do you do now? (8:42)

Mariano: So now it's mostly GenAI applications and some others, and vision as well. Something of a jack of all trades. Okay. Like getting things done. (8:45)

Mariano: I call it like sitting with teams, figuring out what's the problem you're trying to solve, what tools can we use to solve it, and driving it until the end. My last few projects are around helping sellers. (9:01)

Alexey: Maybe we can take a step back and unpack what I said at OLX, he works on AI powered media solutions that simplify the listing process for sellers. What the hell does this actually mean? (9:21)

AI Media Solutions for E-Commerce Sellers

Mariano: OLX, for the ones who don't know, is similar to Craigslist if you're in the US, LeBonCoin if you are in France, Gumtree if you're in the UK, or Facebook marketplace. (9:41)

Mariano: I think that still exists, but I'm not sure how popular it is these days. Wallapop if you are in Spain. It's a secondhand marketplace. OLX is quite strong in two specific domains, which are real estate and motors, meaning cars. So secondhand cars and houses, which in most cases are also secondhand. (9:48)

Mariano: Yes, we can read your comment, Lord Dracula. So the point is that these applications are mostly there to help sellers, whether for cars, for houses, or for anything else, publish their ads on OLX better or more easily. (9:48)

Mariano: Now, for one of the applications: the world is moving to video. You're broadcasting. When you consume media, even Reddit nowadays is mostly videos. There are still pictures as well, but the marketplaces are still image based. (10:44)

Video-to-Ad: The Future of Marketplaces

Mariano: If you go to Amazon or to eBay, videos are slowly starting to appear, but they are not there yet. If you go to the Chinese marketplaces, it's only video. You mean like AliExpress or Temu? Those guys, it's video everywhere. (10:58)

Mariano: So now we're thinking about how to exploit this media format, which is ubiquitous these days. We have a couple of solutions. At OLX we like calling these projects "something to something." One of them is ad to video: given an ad that has only images, we create the video for you. We generate the narrative and the narration with GenAI, add the animations and the music, and you end up with what we call a short-format video, sometimes vertical, sometimes horizontal depending on the platform. The objective is for users who don't have videos to be able to have them. (11:32)

Mariano: In other applications what we do is, for example, video to ad. Meaning, okay, I recorded a video of my car. I have the exterior, the interior of the car, the engine, the sound. (12:48)

Alexey: That sounds very very useful to me. (13:03)

Mariano: Then the seller comes and says, okay, upload the video and forget it. Everything gets created from there: we detect which car it is, the model, the version, all the details, and we extract the images, because you still need... everybody wants to watch a video when they are searching. (13:10)

Alexey: To be honest, it sounds way more useful than ad to video. For me as a content creator, I always have this problem. Let's say I create a tutorial; I have a similar kind of problem. (13:36)

Automated Content Creation for Sellers

Alexey: I can first create text and then from this text I can create a video or I can create a video and then from this video I create a text tutorial and for me the second option is easier. So I already have a video then I transcribe this video. I take pictures from this video and I have a tutorial and then at the end I post. (13:51)

Mariano: I agree in that sense. If you're already in that flow of creating videos, yes, exactly, it's much easier. Now, from a business perspective, not all the users who create ads at OLX are used to creating videos, for two reasons. For historical reasons, it was never necessary. (14:10)

Mariano: And because of industry standards or demographics as well. The users who create the ads may not be on TikTok; they're not used to the video format. They may be the owners of a large dealership. Some have adopted this new media, but not everyone. (14:39)

Alexey: Because I imagine if I am a dealer and I have a ton of cars. So I just start recording and then I go around one car then I go around to another car and in my ideal world I just upload this long video and then it would cut it and create like 12-20 ads from that. (15:11)

Mariano: You should talk to a product manager. There are some technical limitations, but that's the dream. Yes, we want to achieve that level of automation. (15:28)

Mariano: That's what we dream of. Then come the complications of making it work, because that's where we use AI, and it's not magic. Some things break, and those take time. Dealing with images, you can probably run a model on an image in milliseconds. (15:42)

Mariano: A video is thousands of images, so you have to treat it differently. So yes, there are other applications I can go into later. (15:42)
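The remark that a video is thousands of images can be made concrete with a rough calculation. The sketch below is only an illustration: the clip length, frame rate, per-frame latency, and the idea of sampling one frame per second are all assumptions chosen for the example, not details of the OLX pipeline.

```python
from __future__ import annotations


def sampled_frame_indices(duration_s: float, fps: float, every_s: float = 1.0) -> list[int]:
    """Return the indices of the frames to process, one every `every_s` seconds."""
    total_frames = int(duration_s * fps)
    step = max(1, round(every_s * fps))  # never step by less than one frame
    return list(range(0, total_frames, step))


def estimated_seconds(n_frames: int, ms_per_frame: float) -> float:
    """Rough inference time for n_frames at a fixed per-frame latency."""
    return n_frames * ms_per_frame / 1000.0


if __name__ == "__main__":
    # Hypothetical numbers: a 60-second clip at 30 fps, 50 ms of inference per frame.
    all_frames = int(60 * 30)
    kept = sampled_frame_indices(duration_s=60, fps=30, every_s=1.0)
    print("every frame:", all_frames, "frames,", estimated_seconds(all_frames, 50), "s")
    print("sampled:    ", len(kept), "frames,", estimated_seconds(len(kept), 50), "s")
```

Under these assumed numbers, processing every frame takes far longer than the clip itself, while sampling brings the cost back to a few seconds, which is one reason video usually needs a different processing strategy than single images.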

Alexey: I think if we talk about this too, we already will fill up the hour. But for me when I think about you, you're somebody who is taking things end to end and so for these problems and in general for you, what does it mean? Like when you hear end to end for you specifically, what exactly does it mean? (16:29)

Alexey: For example for these projects when you take ownership of a project end to end like let's say take this video to ad or ad to video, for you when you work on these things end to end what exactly does it mean? (16:29)

Mariano: Good question and I think I suggested this topic because it really goes to my heart. I personally learned and this is because of personal experience. (17:15)

Defining End-to-End Ownership in Data Science

Mariano: I learned which projects I was most successful with in data science, because not every project has the same impact on the business. Some have more, some have less. Personally, I started doing AI, ML, and data science because I was interested in the topic. (17:27)

Mariano: With time, the way you measure how well you are doing is also how much impact you're having on the business. You want a measure of success. In my experience, the projects that had the most success were the ones where I was most involved end to end. In the beginning, sometimes you would create a model in a notebook. This was years ago now. (17:27)

Mariano: Now things have changed a little bit. But you would create the model, share it with your colleague, explain how it works, and then they would put it in production. (17:27)

Alexey: So this is the classical "I'm a data scientist, I train models; you're an engineer, please take my model and make it available"? (18:33)

Mariano: Yes, and it actually depends on the company. Some companies or teams push for one mindset or the other. Every manager has their own view, and there are situations where it's valuable to work like that. But in my experience, the projects where I was most successful were end to end, meaning you define the requirements with product, you collect the data, you create the model, and you create the application where it will run, be it a Kubernetes deployment, a batch job, or a cron job. (18:40)

Mariano: Or serverless, it doesn't matter. You create the place where it's going to run, you operate it, you monitor it, and you make sure it's working the way you intended. I noticed that most of the problems and the learnings actually come from that cycle. You shared this some time ago, many, many years ago. (18:40)

Mariano: The CRISP-ML, do you remember that? CRISP-DM, I think it was called, or something like that. CRISP, learning from... Maybe we can share the link to those resources later. (18:40)

Mariano: These cycles change with time. Now we train fewer and fewer models. We use models as a commodity; more and more we use endpoints that are models. (18:40)

Mariano: But still, the concept of learning from your production mistakes makes sense to me. (18:40)

Alexey: I recently wrote another article revisiting CRISP-DM and how it applies to the modern world, and it actually still applies quite well. Maybe you don't train the model yourself, but you still have to do all the steps in order to have a working thing. Surprisingly, they haven't changed since this standard appeared in 1991 or something, when I was 2 years old. (20:43)

Alexey: So, still applies. (21:04)

The Longevity of the CRISP-DM Framework

Mariano: Yeah. I think everything has changed, but not everything has changed. What we are living through now feels revolutionary, but some things remain: you need to evaluate your model, you need to have monitoring, you need to know when it's drifting. So, coming back to your question about end to end: yes, of course it's hard to ask somebody to understand and manage all of these skills in depth. It's super hard to be good at all of these things at the same time. I don't claim to be. But I do think that you should be curious to get involved and to collaborate if you're in data science. (21:12)

Mariano: I assume most of the audience here are data scientists. If you're a data scientist or analyst, work together with engineers, looking together at what your model is doing, how it's receiving the data, what data it's receiving, how to collect historical data, alerting, and so on. And if you are able to do it yourself, even better, because you will know the ins and outs of how it was deployed. (22:05)

Mariano: I don't have any one in mind right now, but I have a thousand stories of a model receiving a parameter in the wrong format, or being called too early in the process and not having the data that we assumed during training. There are countless things that can happen when you put a model in production that you don't account for, especially if whoever is deploying it doesn't have the context of what you used to measure your solution. Normally now you don't train, but you still measure your solution offline before going to production. You say, okay, I think that with this solution we can be 80% accurate at solving this problem. (22:39)
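One common way to catch the "parameter in the wrong format" class of problems is to validate requests at the serving boundary against the assumptions made offline. The following is a minimal sketch using only the standard library; the `ListingFeatures` fields, the `parse_request` helper, and the value ranges are hypothetical and stand in for whatever a real model was evaluated on.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ListingFeatures:
    """Hypothetical model input; the field names are illustrative only."""
    price_eur: float
    mileage_km: int
    year: int


REQUIRED_FIELDS = {"price_eur", "mileage_km", "year"}


def parse_request(payload: dict) -> ListingFeatures:
    """Validate a raw request against the assumptions made during offline evaluation."""
    missing = REQUIRED_FIELDS - set(payload)
    if missing:
        # Typical symptom of the model being called too early in the flow.
        raise ValueError(f"missing fields: {sorted(missing)}")
    try:
        features = ListingFeatures(
            price_eur=float(payload["price_eur"]),
            mileage_km=int(payload["mileage_km"]),
            year=int(payload["year"]),
        )
    except (TypeError, ValueError) as exc:
        raise ValueError(f"wrong format: {exc}") from exc
    if features.price_eur <= 0:
        raise ValueError("price_eur must be positive")
    if not 1950 <= features.year <= 2100:  # assumed range, not a real constraint
        raise ValueError("year outside the range assumed offline")
    return features


if __name__ == "__main__":
    print(parse_request({"price_eur": "12000", "mileage_km": 80000, "year": 2015}))
    try:
        parse_request({"price_eur": 12000, "mileage_km": "80,000", "year": 2015})
    except ValueError as err:
        print("rejected:", err)
```

Failing loudly at the boundary, and logging what was rejected, gives the person who evaluated the model a direct view of how production inputs differ from the offline data.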

Alexey: Yeah, you probably started with some assumptions? (23:30)

Mariano: Yeah. (23:38)

Alexey: I have a bit of an off-topic question. Maybe not off topic, because this is something you also mentioned in the document, but let me bring it up. You mentioned the old way of doing things: a data scientist works on a model, then they just hand it off to an ML engineer. (23:38)

Alexey: The ML engineer makes it production ready, then they hand the service off to an SRE, and the SRE watches and monitors it. You were saying that as a data scientist it would be nice to be involved in all the steps, but you also noted that in some companies it still makes sense to do it the old way, meaning you have this clear separation between roles. My question is: now that we have all these AI tools like Cursor and Claude Code, you can just get an engineer, let's say an ML engineer or any software engineer, and with tools like that they can do this thing end to end without any data scientist or anyone else. So do you think that is the case? Do we still need to think about the old way, where data scientists would do their part, ML engineers would do their part, and SREs would do their part, or do we even need to care about these roles anymore? (23:38)

Alexey: Or is it more that you just call yourself an engineer, you use tools like Claude Code, Cursor, Codex, or whatever assistant you use, and with these tools you cover the entire cycle, the entire loop that we described: refinement, collecting the data, creating the model, creating the application, operating, and monitoring? Because you can do all of that with the help of agents. (23:38)

Mariano: So actually it is happening. I saw that at my company. Look, I changed companies; it's going to be a year and a half. (25:21)

Impact of Agentic Coding and GitHub Copilot

Mariano: At the previous one, agentic coding became a thing early last year and even more during... I remember when I was at OLX three years ago or something like that, people were already using GitHub Copilot. (25:35)

Alexey: Yes. But not to that extent like today. (25:52)

Mariano: Yes, of course. Not at the level where you could vibe code like that, where you could get yourself into areas you don't know. (25:59)

Alexey: Yes. We couldn't do this before. We couldn't do it. (26:04)

Mariano: Now we can do it. So I see AI applications, mostly LLM applications, popping up in areas where traditionally they wouldn't do AI. You know, mostly mobile or front-end teams doing it. (26:14)

Alexey: Yes. (26:26)

Mariano: And they can achieve decent results. The problem starts to appear when things get more complex, when they want to start modifying things. I think it cuts both ways: when you don't have in-depth knowledge of what you're doing, all the software engineering challenges appear, like how to make sure you're not introducing a regression. Regression not in the data science sense, but in the software engineering sense, where by introducing a new feature, a new functionality, or a new use case, you're solving one problem that the AI model had. (26:32)

Mariano: But you need to make sure you're not breaking something else. Having evaluations, being able to measure the model, knowing how to do this systematically, and preventing data leakage are things that can be learned by an engineer, but normally they are things that data scientists excel at. (27:14)

Alexey: So for us it's natural. I always think, okay, we have this thing, how well does it work? And then I immediately think of validation. Validation in the old terminology; now we call it evaluation, but it doesn't matter. We have a thing, we have a gold standard data set, and we run this thing against the gold standard data set. For me as a data scientist this is like 101; it's something we always have to do. (27:39)
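Running a system against a gold standard data set, and checking that a new version does not break cases the old one handled, can be sketched in a few lines. Everything below is made up for illustration: the gold examples, the two toy `predict` functions, and the labels. In practice `predict` would wrap a model or an LLM endpoint.

```python
from __future__ import annotations

from typing import Callable, Sequence, Tuple

GoldExample = Tuple[str, str]  # (input text, expected label)
Predictor = Callable[[str], str]


def accuracy(predict: Predictor, gold: Sequence[GoldExample]) -> float:
    """Share of gold examples the predictor labels correctly."""
    correct = sum(1 for text, expected in gold if predict(text) == expected)
    return correct / len(gold)


def regressions(old: Predictor, new: Predictor, gold: Sequence[GoldExample]) -> list[str]:
    """Inputs that the old version got right and the new version gets wrong."""
    return [text for text, expected in gold
            if old(text) == expected and new(text) != expected]


# Toy gold set and toy predictors, purely illustrative.
GOLD = [
    ("red sedan 2015", "car"),
    ("two bedroom flat", "real_estate"),
    ("mountain bike", "other"),
    ("family house with garden", "real_estate"),
]


def predict_v1(text: str) -> str:
    if "sedan" in text:
        return "car"
    if "flat" in text:
        return "real_estate"
    return "other"


def predict_v2(text: str) -> str:
    # Fixes "house" but accidentally starts treating bikes as cars.
    if "sedan" in text or "bike" in text:
        return "car"
    if "flat" in text or "house" in text:
        return "real_estate"
    return "other"


if __name__ == "__main__":
    print("v1 accuracy:", accuracy(predict_v1, GOLD))
    print("v2 accuracy:", accuracy(predict_v2, GOLD))
    print("regressions:", regressions(predict_v1, predict_v2, GOLD))
```

In this toy case both versions score the same overall, yet the second one breaks an example the first handled, which is why it helps to look at per-example regressions and not only at the aggregate metric.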

Mariano: Exactly yeah. (28:04)

Alexey: So engineers now have to learn? (28:09)

Mariano: Exactly, you have to learn, or that's where specialization plays a role again. And it goes the other way around. I can vibe code some front-end application to show how it will look. (28:15)

Alexey: I can even use Figma these days. (28:26)

Mariano: That doesn't mean it's a good solution or that it will be sustainable in the long term. I don't know what the best practices are when designing HTML components or things like CSS. (28:31)

Mariano: I don't know. I would have to learn it completely. Look, something it helps with a lot is prototypes. With vibe coding for prototypes, instead of writing down your idea and explaining it in words, now you can explain your ideas in code, with a working thing. (28:42)

Alexey: Yeah. (29:06)

Mariano: Or it allows you to fix issues. Let's say you have a nasty bug and you don't have the person who is an expert in that area around; agents can help you navigate those types of situations. But at this point, and it may change in six months because it's hard to predict the future, when someone is not an expert in an area, if they don't pay a lot of attention and only do agentic coding, the sustainability of that is questionable. I think data scientists are still needed. (29:06)

Alexey: Because this is actually one of the things people ask: "Hey, do you think it makes sense for me to specialize in data science now, or should I choose a different route?" So you think it's still okay? (29:54)

Mariano: Yes. Maybe don't become an expert in designing your neural network architecture, unless that specific niche is the one you want to work in. Maybe seven years ago, building your Keras model, connecting all the layers, and being an expert in building architectures made sense. (30:06)

Mariano: Now, unless you work in that specific niche where you build networks, that becomes less of a skill. But everything else, I think, is still relevant. scikit-learn, as old as it is... (30:33)

Alexey: Maybe the models don't make a lot of sense, but all the tooling for the models, all the concepts that it introduces, still do. (30:51)

Mariano: Right. (30:58)

Alexey: I think there are still many cases where plain old logistic or linear regression is still the solution. More than half a year ago, if not more, I was talking to a company and they were telling me about a problem they had in data science. They said, "We're trying to predict the price of housing, but it doesn't work." I'm like, okay, tell me more. They said, "We send this to an LLM, and the prices the LLM predicts are bad." And I'm like, wait a minute, are you trying to predict housing prices with an LLM? (30:58)

Why LLMs Aren't Always the Best Solution

Alexey: So yeah, I think these are the cases where you actually need a data scientist. (31:42)
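For a tabular problem like housing prices, a reasonable first step is a trivial baseline followed by a simple linear model. The sketch below uses only the standard library and a single feature; the areas and prices are invented for the example, and a real project would more likely use a library such as scikit-learn with many features.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    mx, my = mean(xs), mean(ys)
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return slope, my - slope * mx


def mae(preds, ys):
    """Mean absolute error between predictions and true values."""
    return mean(abs(p - y) for p, y in zip(preds, ys))


# Made-up data: living area in square meters, price in thousands of euros.
train_area = [50, 70, 90, 110, 130]
train_price = [155, 205, 275, 325, 395]
test_area = [60, 100]
test_price = [185, 300]

# Dummy model: always predict the average training price.
baseline_pred = [mean(train_price)] * len(test_area)

# Simple model: a straight line through the training data.
slope, intercept = fit_line(train_area, train_price)
linear_pred = [slope * a + intercept for a in test_area]

baseline_mae = mae(baseline_pred, test_price)
linear_mae = mae(linear_pred, test_price)

if __name__ == "__main__":
    print("baseline MAE:", baseline_mae)
    print("linear MAE:  ", linear_mae)
```

Any more elaborate approach, an LLM included, would then have to beat both numbers on the same held-out data before it is worth its cost.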

Mariano: Yes. Maybe to add a bit of color: I learned data science by reading scikit-learn. That's because back then at the university you either had a course on neural networks or a course on numerical calculus, like using computers to do calculus, but there was nothing that would put it all together, like what you now have on Coursera or what you do, something that teaches you how to apply this. (31:47)

Mariano: And the scikit-learn documentation was amazing for that. It was an entry door for me to a world that I still use. I remember there was one diagram, I should look for it, about how to select which model to use, and it was a decision tree. Like, are you trying to predict a magnitude or a class? Do you have many or a single one? (32:29)

Mariano: Nowadays this has become bread and butter, but when I was studying it made a lot of sense. (32:55)

Alexey: Yeah. I think it still makes sense, but some of these decisions can be made by an AI system right now. The question I have for you is this: you described the process for any business problem that involves AI, machine learning, whatever. You need to define the requirements. (33:06)

Alexey: You need to collect the data or you need to even understand what kind of data you have before you collect it or what kind of data you need. Then you create a model or you use an existing model. Then you evaluate this model. (33:25)

Alexey: Then you deploy this model somehow. You deploy this application either through Kubernetes or as a batch job or whatever. Then you operate this model. You monitor this model so you make sure that it works the way you want it to work. I'm just reciting what you told us. I will note it down. (33:37)

Alexey: Now we have AI systems, and potentially an AI system can do all these steps. But in which steps do you think they still need humans? As a data scientist or as a software engineer, where can I rely more on models, on AI and GenAI, and where am I still very much required to steer and pilot the process and tell the model what to do? (34:01)

Mariano: So actually there is one step before working with the model, and that's the problem you're trying to solve and how you're planning to solve it. Many times it starts when a product manager, a boss, or a manager arrives and says, "We need to solve this problem. We need to predict the price of houses." (34:29)

Mariano: Awesome. You brought me a machine learning, data science problem. No questions asked, I start trying to predict. (34:56)

Mariano: And then you investigate: why do you want that? "Because we are building a mortgage risk analyzer, and we use this to predict whether the house the user is offering for the loan is good enough." Okay. So the problem is that you didn't want to check the real price. You wanted to check whether the house was valuable enough for this loan. (35:08)

Mariano: You wanted a decision, not a price. I'm just making it up now, but it often happens that the problem you're trying to solve, and the stage at which you're trying to solve it, is not optimal. (35:34)

Mariano: By discussing and better understanding the problem at hand, sometimes you can come up with better solutions, because you can ask for better input, or you can solve the problem a bit earlier or a bit later. And sometimes you don't need AI at all. (36:00)

Alexey: Yes, that as well. There was this document from a Google employee. I think it's the Rules of ML. (36:27)

Mariano: Yes. And many of the things there still apply, like start simple. Make sure you need AI. Don't be afraid to start with a dummy model. (36:36)

Mariano: All of this the AI cannot do. It's too high level to understand. It needs a lot of context. You need to understand what company you're working for, what user problems you're trying to solve, and how they were solved in the past. (36:49)

Mariano: If an LLM had all of that in its context, maybe it would be able to solve it. But there is a lot of implicit context when you work at a company that the LLM cannot get. So that's the first one: I would say it's something where humans are not easily replaceable. (37:02)

Alexey: Yeah, we have a lot of implicit context that we get by having conversations. (37:16)

Mariano: Mhm. (37:22)

Alexey: I was actually hoping you would say define requirements as the first step but the answer you gave is way better but still this is the answer. So to me, defining the requirements is like translating from business requirements to machine learning terms. (37:22)

Translating Business Needs to ML Requirements

Alexey: This is where we still need data scientists in my opinion. It's the same as you said. I think I agree with you Alisa because actually the requirement gathering step is the place where you actually ask these types of questions like you don't blindly collect requirements. You also challenge if the business requirements make sense while collecting them. Absolutely. (37:39)

Alexey: This is what I loved about being a data scientist, because as a data scientist you are way closer to product than as a software engineer. At least this was my experience. I imagine there are software engineers who are like product engineers, almost like product managers, but as a data scientist I was always sitting near product managers. I was always very close to them, trying to understand what we were really trying to solve, and this was very helpful not only in my career but also in my day-to-day job. (38:07)

Mariano: Yeah. Look, at the risk of oversimplifying, I often say that what we do in data science is build boxes, machines that make decisions. They decide things, so which decisions are made and how they are made have to be highly connected with the product you want to deliver. (38:40)

Mariano: So yeah, I think we are preaching to the choir here. The other part that still needs humans, though not necessarily data scientists, is the ground truth. Okay, things have improved a lot. You can have self-supervised models. (39:09)

Mariano: You have transformers as a brilliant example, but you can take that approach to your own data. You can self-supervise your model or your application with your own data to build, maybe, a good embedding, and you already get a lot of goodness, or you can use an LLM as a judge these days. But there are still lots of use cases where you need ground truth. Sometimes, in e-commerce at least, you can get it directly from what the users do: the feedback of your users can be your labels. But there are many, many situations where you are still missing the ground truth data. (39:28)

Mariano: So there are still situations where you need humans to label. The whole approach to labeling and how you prioritize it, let's call it the labeling strategy, is still something where I think data scientists can bring a lot to the table. (40:22)

Alexey: Especially for you, because you were in search and recommendations, and I think these are the domains where you really have to know how to label the data and how to evaluate the systems you run. This is partly where the signal comes from in search systems. Many people are only now discovering what people in search have been doing for so long: that we actually need proper feedback loops, human in the loop, all these kinds of things. But you've been doing this for a very long time? (40:43)

Mariano: Yeah. Normally in this domain we talk about explicit feedback and implicit feedback. (41:15)

Managing Explicit and Implicit Feedback Loops

Mariano: Explicit feedback is whenever somebody tells you directly. The classic example: you have a chatbot, you make a suggestion, and the user tells you "no, that's wrong" or gives you a thumbs down. The thumbs down works as explicit feedback. Implicit feedback is: you didn't tell me, but by using my recommendation you are telling me that what I recommended made more sense than the things you didn't select. (41:28)
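
To make the distinction concrete, here is a minimal Python sketch of how logged interactions could be turned into labels. The `Interaction` record and its field names are hypothetical, not something described in the conversation, and the thumbs signal is assumed to refer to the clicked item (or the top suggestion if nothing was clicked).

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Interaction:
    """One logged event; all field names here are hypothetical."""
    query: str
    shown: List[str]               # item ids shown to the user, in order
    clicked: Optional[str] = None  # implicit signal: the item the user picked
    thumbs: Optional[int] = None   # explicit signal: +1 or -1, if given


def explicit_labels(event: Interaction) -> List[Tuple[str, str, int]]:
    """A thumbs up/down directly labels one item as good (1) or bad (0)."""
    if event.thumbs is None or not event.shown:
        return []
    target = event.clicked or event.shown[0]
    return [(event.query, target, 1 if event.thumbs > 0 else 0)]


def implicit_pairs(event: Interaction) -> List[Tuple[str, str, str]]:
    """A click implies the chosen item beat the items shown but not chosen."""
    if event.clicked is None:
        return []
    return [
        (event.query, event.clicked, other)
        for other in event.shown
        if other != event.clicked
    ]
```

Explicit feedback yields a direct label, while implicit feedback only yields preferences (chosen item versus skipped item), which is why the two are usually stored and weighted differently.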

Mariano: But yes, and yes, active learning is a topic that still is relevant. Okay, it's maybe a bit advanced. But I still haven't fully mastered it. I think there is so much to exploit. (42:05)

Alexey: But it makes a lot of sense, you know, how to prioritize the things to be labeled that bring the most to your model? Because you can waste a lot of time in labeling. (42:27)
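
One common way to prioritize what gets labeled is uncertainty sampling: send the items the current model is least sure about to the annotators first. The sketch below assumes you already have predicted class probabilities per item; the function names are illustrative, and this is only one of several active learning strategies, not necessarily the one used at OLX.

```python
import math
from typing import Dict, List


def entropy(probs: List[float]) -> float:
    """Shannon entropy of a predicted class distribution (higher = less sure)."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def pick_for_labeling(predictions: Dict[str, List[float]], budget: int) -> List[str]:
    """Return the `budget` item ids whose predictions are most uncertain."""
    ranked = sorted(
        predictions,
        key=lambda item: entropy(predictions[item]),
        reverse=True,
    )
    return ranked[:budget]
```

With a fixed labeling budget, items where the model is already confident are skipped, so annotator time goes to the examples most likely to change the model.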

Mariano: And it's the thing that usually kills projects in my opinion. Oh no, this is too hard. We cannot get the ground truth. So you cannot build a model. You don't have an application. Right. (42:38)

Alexey: Yeah. So the title of this talk, or event, is "From Notebook to Production: Building End-to-End AI Systems", and we haven't mentioned any notebooks yet. Maybe we should, because this is the title. Since you're somebody who is involved in doing projects end to end, perhaps we can take one of the projects you're working on and you can tell me the story: how it started, how you collected requirements, to the extent you can discuss these things. For some people it would also be interesting to know the technologies you used at each of these steps. At the beginning maybe you didn't need a lot of technology, mostly Word documents or whatever you used, but it would still be interesting to know what kind of tooling you use for each step. (42:50)

Mariano: Great so we have two applications that I mentioned one of them the one you like the most I think the video to add. This one isn't fair and doesn't use that much LLMs. Actually we use much more visual models. (43:43)

Mariano: While the other one, the one that creates the videos, is much more... maybe let's talk about that one because it's more hyped, kind of. Okay, let's go to the one based on GenAI. I'm not the only one working on this one. This is a team effort. Actually, recently I've not been that involved. So I will present a bit of what I know, how it was working a couple of months ago, and what can be shared. (43:59)

Mariano: So how does it go? It's a purely generative solution. There is no right or wrong output. There are better or worse ones. (44:33)

Mariano: As I said, based on the images, the description, the title, and the item being sold at OLX, this application creates a video, a compelling video explaining what's being sold and its characteristics. In this case, it's focused mostly on cars. So it shows you, okay, welcome. Okay, not welcome. But hey, this is a Toyota Camry from 2009. It is 10 years old. (44:56)

Mariano: It had all its services done. That's the concept. (45:32)

Alexey: So this is the ad in the listing description? So this is what you exactly and the attributes of all the structured fields as well. (45:38)

Mariano: And as it narrates this, it shows the different angles of the car and we make sure that the angles of the car match with what is being said. (45:44)

Alexey: Okay. So, you have to make sure that audio and video are in line? (45:58)

Mariano: Correct. Yes. That they narrate a story. We actually had to do workshops with professional video editors, people who work, you know, with Adobe Premiere and who go and record these things, to learn what they use to decide how they show the car. (46:03)

Mariano: Even the cuts. I learned a lot about cinema during those workshops: the transitions, how you cut, how you mix the audio from scene to scene. So going back to how we started. Actually, I was not around. This came out of a hackathon. (46:24)

Alexey: So there was no hypothesis at the beginning because usually you kind of come up with a hypothesis that this is something that would be interesting for our customers? (46:44)

Mariano: Yeah. So actually it came more from the AI team, but the hypothesis was around, and it was the product manager saying that new generations look like they like video. They consume video, they spend most of their time on video platforms, and we don't have a video format. How can we leverage what we have so that we have videos? (46:56)

Mariano: So that's how it all started: how to generate videos from the assets we already have. At the beginning it looked more like PowerPoint slides, with one narration on top that told everything. But we were working very closely with the customers. We started showing this to customers and they gave us feedback. (47:22)

Alexey: Customers are dealers? Or the dealers in this case. Yes. (47:50)

Mariano: The car dealers. These would be people who normally create videos: what do they prefer in the videos? (47:57)

Alexey: So there are already dealers who create videos? (48:04)

Mariano: Yes. Or even the ones that didn't do it but wanted to, what was their preference, what was important, what was annoying to them, what they didn't want to be mentioned. So something we had to work early on so in a nutshell a (48:06)

Architecture Deep Dive: Image Description Logic

Mariano: A little bit about the architecture. Technologically speaking, this was LangChain, because it was the predominant library back then. It would connect with the LLMs and ask for different things: describe the images in the ad, collect the description, put it all together, and build a context with which we would ask for a narration. Something like: hey, give us a script to say out loud for this ad. It will have more or less this number of scenes; let's focus in each scene on a different aspect, the interior, the exterior, or the safety features. We would allow the agent to freely explore a narration. (48:26)

Alexey: Is it correct that you didn't do any fine tuning because I see a question from Ahmed? He's asking if you needed to fine tune anything or you just used it for this project. (49:15)

Mariano: No, it was mostly prompt engineering. We had evaluations and we used LLMs as a judge to make sure that we would always be truthful to the input, meaning the ad: that we wouldn't hallucinate, invent things, or say something that isn't there. We had other requirements from the sellers, for example not to use marketing lingo or marketing slang like "you won't miss this opportunity". No, keep it toned down, keep it much more factual, just present the car. (49:27)

Mariano: Don't mention the price, for example, because the price can change and we cannot change the video that fast. Things like that. So we started adding evaluations for things like that to improve. So, coming back to the fine-tuning: we had combinations. Creating a script is not a difficult task. (49:55)

Mariano: For an LLM, if you have the right context, writing a script that tells you about the car is something it can do. It's purely textual. These things work well. Then when it comes to images, we had to combine it with other models, because this is not a purely agentic approach. Sometimes we take control back from the agent and do things ourselves. There are steps where we analyze the images with our own models, historical models we have at OLX, and we try to understand what's in the image as well, using, I don't know, YOLO models and CLIP-type models. CLIP actually, I don't know if you teach it in any of the courses, but CLIP models are something I highly suggest everyone learn about. They are (50:36)
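
Some of the seller requirements in this part of the conversation (no marketing slang, no price mentions) can be checked with plain rules, next to an LLM judge that checks truthfulness. The sketch below is an assumption about how such a check could look, not the team's evaluation code; the phrase list and the price pattern are hypothetical placeholders.

```python
import re
from typing import List

# Hypothetical phrase list; a real one would come from seller feedback.
MARKETING_PHRASES = ["you won't miss", "don't miss", "once in a lifetime", "best deal"]

# Matches currency amounts such as "$5,000", "5000 USD", or "€ 4.500".
PRICE_PATTERN = re.compile(
    r"(?:[$€£]\s?\d[\d.,]*)|(?:\b\d[\d.,]*\s?(?:usd|eur|dollars|euros)\b)",
    re.IGNORECASE,
)


def check_script(script: str) -> List[str]:
    """Return a list of rule violations found in a narration script."""
    problems = []
    lowered = script.lower()
    for phrase in MARKETING_PHRASES:
        if phrase in lowered:
            problems.append(f"marketing phrase: {phrase!r}")
    if PRICE_PATTERN.search(script) or "price" in lowered:
        problems.append("mentions price")
    return problems
```

A script that returns an empty list passes these rules. Anything the rules cannot express, such as whether the narration stays faithful to the ad, would still need a judge model or a human.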

Mariano: Maybe we should. So they are still quite, yes, I think they are, what do you call it in English, like a workhorse? They are very efficient. They're very lightweight and they can solve a lot of things. It's good to create an ensemble, to use them in combination with other models. (51:33)

Alexey: It's interesting what we call lightweight these days. (51:53)

Mariano: Yes. But for me look for me if it runs if you compare. (52:00)

Alexey: Yes. Yeah. (52:06)

Mariano: If it runs in under a second, for me it's lightweight these days. Yes. The vision transformer architecture, the ViT, is quite powerful. I think it's quite good, in my opinion. (52:06)

Alexey: So all that stuff you were doing in a notebook I assume like if we talk about? (52:12)

Mariano: Yes. So there were things in a notebook. These days look and that's something actually I regret putting in the title. These days notebooks I wouldn't say are dying but they are becoming less and less relevant. Okay. Why? (52:30)

Mariano: It's a combination of things. This is my personal preference. Nowadays I use notebooks for a quick exploration or for building a report. Whereas for my main model, or for running a large evaluation, the main pipeline, besides the script that goes into production, many times what we have is a CLI tool that runs the model end to end. (52:42)

Alexey: Don't you first like in order to know what you want to put in the CLI tool you need to do some exploration? (53:12)

Mariano: Yes, exploration, and this is what you use a notebook for. But many times our applications are compositions of many, many models. Imagine: we analyze the images, we call LLMs, we call text to speech, sometimes speech to text. We combine things all over the place; in other projects we even run linear solvers. So there are several things you run, and sometimes when you work in a notebook you don't want to run the full thing. You want an isolated place where you run just the specific model you are adding, tuning, or testing. Then, when you want to test it, you run the full thing with something like, I don't know, it could be MLflow, some kind of experiment setup. (53:20)
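
As an illustration of a CLI tool that runs a multi-model pipeline end to end but can also run one step in isolation, here is a minimal sketch using `argparse`. The step functions are hypothetical stand-ins for real models (image analysis, script writing, text to speech), and the `--only` flag is an assumed design, not the team's tool.

```python
import argparse
from typing import Callable, Dict, List, Optional


# Hypothetical stand-ins for real models; each step reads and updates a shared dict.
def describe_images(state: dict) -> dict:
    state["image_notes"] = [f"description of {name}" for name in state["images"]]
    return state


def write_script(state: dict) -> dict:
    # Falls back to raw image names when run in isolation.
    notes = state.get("image_notes", state["images"])
    state["script"] = " ".join(notes)
    return state


def synthesize_audio(state: dict) -> dict:
    script = state.get("script", "")
    state["audio"] = f"<audio for {len(script)} chars>"
    return state


STEPS: Dict[str, Callable[[dict], dict]] = {
    "describe": describe_images,
    "script": write_script,
    "audio": synthesize_audio,
}


def run(state: dict, only: Optional[str] = None) -> dict:
    """Run every step in order, or just one step when `only` is given."""
    names = [only] if only else list(STEPS)
    for name in names:
        state = STEPS[name](state)
    return state


def main(argv: Optional[List[str]] = None) -> dict:
    parser = argparse.ArgumentParser(description="Run the ad-to-video pipeline sketch.")
    parser.add_argument("images", nargs="+", help="image file names for one ad")
    parser.add_argument("--only", choices=list(STEPS), help="run a single step in isolation")
    args = parser.parse_args(argv)
    return run({"images": args.images}, only=args.only)
```

Calling `main(["front.jpg", "rear.jpg"])` runs all three steps, while `main(["front.jpg", "--only", "describe"])` runs only the image step. In a real tool, `main()` would be called under an `if __name__ == "__main__":` guard.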

Alexey: To me it seems like still the way you first experiment with something in a notebook then you package it as a CLI application and then you already modify the CLI application. (54:12)

Mariano: Look, these days, with, what's the other topic we have in the document? Agentic coding. You could just create a CLI application directly, skipping the notebook stuff, or a Streamlit app. Creating Streamlit applications, not sure if you're familiar with them, became super cheap. (54:23)

Mariano: So now if I want to create a report, or something that allows me to explore this new model I'm building, I can quickly create a Streamlit app only for that. Notebooks were super useful for reproducibility, because a report could be reproducible and it was costly to go and reimplement everything that was in the notebook. But now you can have a report in a Streamlit app without a lot of effort. And there is also a disadvantage with notebooks and agentic coding. (54:42)

The Declining Role of Notebooks in Production

Mariano: They're way more costly. And still, the edits sometimes break the whole notebook. There are servers that allow you to connect to Jupyter. No, to the Jupyter kernel. (55:28)

Mariano: There is still Marimo, this Marimo. (55:51)

Alexey: Marimo. Right. I heard good things about Marimo. I haven't tested it yet. Yeah. Because it's Python code. You don't have any JSON. Exactly. (55:56)

Mariano: Yes. I think the JSON part of the notebooks is becoming more like a nuisance. (56:02)

Alexey: Well, in my case, I just created a new format for Jupyter. I asked Claude Code to figure out how to get rid of JSON completely and have a plain Python file. After one hour, I had a working thing. Now I have notebooks with plain Python code, which is amazing. A few years ago, I would never even have dreamed of doing something like this. I have no idea how it works, to be honest, but it works. (56:10)

Alexey: Regarding notebooks, I also have a rule in my agentic coding: any code that is not intended for the report itself goes elsewhere. If it's for plotting, you can put it in the notebook, but everything that is analysis logic or functionality goes in .py files next to the notebook. So the notebook ends up being mostly imports and calls to some helpers. For me that has the advantage that the notebook stays way smaller. It's mostly output. (56:37)

Alexey: And the scripts that are used I can easily reuse later if I want to make reference. Right. (57:16)
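
A small sketch of that rule, assuming a hypothetical helper module named `report_helpers.py` that sits next to the notebook. The function and its inputs are made up for illustration.

```python
# report_helpers.py  (hypothetical module kept next to the notebook)
from statistics import mean
from typing import Dict, List


def summarize_scores(scores: Dict[str, List[float]]) -> Dict[str, float]:
    """Average the evaluation scores per model variant, skipping empty ones."""
    return {
        variant: round(mean(values), 3)
        for variant, values in scores.items()
        if values
    }


# In the notebook, a cell then shrinks to an import and a call:
#   from report_helpers import summarize_scores
#   summarize_scores(raw_scores)
```

Because the logic lives in a regular module, it can be imported from a CLI tool or covered by tests without opening the notebook.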

Mariano: Yes. (57:28)

Alexey: Yes. Do you have some time now? Because we kind of have to finish. (57:28)

Mariano: Yes. I can stay a few more minutes. Yes, I don't know if there are questions. There are some questions, and I tried to include them as we went through the conversation, even though I wasn't explicit in calling them out. But yeah, thanks Pablo for asking many questions. Maybe you can quickly cover some of these. For example, one of the questions Pablo asks is: do you use any platform for evaluations, or how do you go about these evaluations? (57:33)

Mariano: Yeah. So we use Arize. Okay. (58:04)

Alexey: Arize. Okay. Yes. I know them. (58:06)

Mariano: At some point we were deciding between them and Langfuse. This is for the LLMs; we still use MLflow for more traditional things. Some things are also being moved to Arize, because Arize can do both. It can do classic ML and LLMs. (58:12)

Mariano: Arize is also really good for tracing, for getting the instrumentation and monitoring. So, did I answer the question? (58:28)

Alexey: Yes. I wonder how well it works with video. (58:39)

Mariano: Okay. So, with video, actually video is very complex because you cannot really use LLM as a judge. You have to use humans here. (58:45)

Alexey: Well, you? (58:53)

Mariano: Yes. Actually for the videos what we do is we evaluate each of the components separately. (58:53)

Alexey: So that you can automate right? (58:59)

Mariano: Yes. A bit. Yes. Like the evaluation of the, but still it has to be synchronized, the audio has to match the video. And are models there yet to be able to do this, or do we need to use something else? (58:59)

Mariano: We have some tricks, actually, to make it work. Not sure how much of that I can share, but we have some tricks internally for achieving that synchronization. It has to do with the order in which you produce the artifacts: the audio, the images you select, and so on. It took us some effort. In the end it's prompt engineering plus clever engineering. (59:18)
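
The internal tricks are not described, so the following is only a generic illustration of why the order of artifacts matters, not the OLX approach. If narration audio is synthesized per scene first, its measured duration can drive how long each image stays on screen. All names below are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Scene:
    text: str             # narration for this scene
    image: str            # image chosen to illustrate it
    audio_seconds: float  # duration measured after text-to-speech


@dataclass
class Shot:
    image: str
    start: float
    end: float


def build_timeline(scenes: List[Scene], gap: float = 0.0) -> List[Shot]:
    """Give each image exactly the time span of its narration audio."""
    shots: List[Shot] = []
    cursor = 0.0
    for scene in scenes:
        shots.append(Shot(scene.image, cursor, cursor + scene.audio_seconds))
        cursor += scene.audio_seconds + gap
    return shots
```

Under this assumption the picture never drifts from the narration, because the video timeline is derived from the audio rather than generated independently.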

Alexey: So you didn't need to do any fine-tuning? I imagine in some cases you would need to create a model that outputs both at the same time, but in your case you just did clever engineering of the steps plus clever prompt engineering? (59:41)

Mariano: Yes, yes. Actually, I don't want to downplay it. It's still a lot of effort. (59:58)

Alexey: But you're right, and actually I think it's an important skill to know when to leave the LLM and solve it yourself. (1:00:04)

Mariano: There we have some rules. Sometimes too much agentic is not a good idea. If you can solve something with code, with rules, you get the structured output from the LLM and you solve it outside. That's good. (1:00:11)
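
A minimal sketch of that idea: ask the model only for structured facts, validate them, and do the deterministic part (here, computing the car's age) in code. The JSON shape and the `CarFacts` fields are assumptions made for the example.

```python
import json
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class CarFacts:
    make: str
    model: str
    year: int


def parse_facts(llm_output: str) -> CarFacts:
    """Validate the model's JSON reply; raises if a field is missing or malformed."""
    data = json.loads(llm_output)
    return CarFacts(
        make=str(data["make"]),
        model=str(data["model"]),
        year=int(data["year"]),
    )


def car_age(facts: CarFacts, today: Optional[date] = None) -> int:
    """Compute the age in code instead of asking the model to do arithmetic."""
    today = today or date.today()
    return today.year - facts.year
```

For example, `car_age(parse_facts('{"make": "Toyota", "model": "Camry", "year": 2009}'), date(2019, 1, 1))` returns 10, and the result stays correct as the calendar year changes.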

Alexey: Then maybe the last question. I know OLX was hiring a lot of juniors and trainees. Has that changed? I know the market is not very junior friendly these days, so I wonder how it is now. (1:00:21)

Mariano: Let me check, let me see. I know we were. (1:00:41)

Alexey: Do you personally work with students? I used to work a lot with students. (1:00:47)

Mariano: Yes. Look, not as much as we did in the past, but we still have, for example, programs with universities where we collaborate with them. I think we have openings for, let me see. I was checking here, not sure. Yes, we still have them. If you go to the site, there are openings for juniors. (1:00:54)

Mariano: It's join. Right. Or what's the link? Has it changed? Just search for OLX jobs or OLX careers. I can also share it with you here and you can find it. I put it here in our chat: careers.olxgroup.com. (1:01:15)

Mariano: Yes. Yes. But if you just start searching jobs at OLX, be careful not to land on our jobs site because we have a unit where we have e-commerce for jobs. Well, it's like LinkedIn for jobs. (1:01:39)

Mariano: But yeah, you can find quite a few open roles there if you want. But look, I don't think the market will stay like this forever regarding juniors. This is a personal view. I think there will be a moment when the agentic ROI plateaus, even though over the past year we have seen it improve all the time. Then companies will need an advantage over other companies, and having more people is always an advantage, so you need to hire juniors. (1:01:52)

Alexey: Yeah. Okay. So, let's hope it happens. Maybe the last thing before we wrap up. So, there was a question about technologies. We cannot go into a lot of details about the technologies you use and tools you use. But maybe briefly so you mentioned Arize, you mentioned MLflow. Are there other tools you use in your day to day? (1:02:28)

Mariano: Well, I mentioned Streamlit. (1:02:47)

The Modern Tech Stack: FastAPI, uv, and Arize

Mariano: Streamlit, we mentioned. Well, what I consider staples now: FastAPI to build APIs. We use LangChain, and sometimes Pydantic AI, for LLMs. What else do we use? Do you still use it? Yes, of course, a lot. The stack is Kubernetes; batch is sometimes even Kubernetes itself, with CronJobs. (1:02:53)

Mariano: What else? At a higher level, in our lake we use Trino, and we also use Iceberg, which, for the ones interested, is a way to turn S3 files into tables. Then from the ML side, YOLO or variants of YOLO, DINO, things like that, object detection libraries. Hugging Face is a beautiful resource. Actually, something I didn't touch on: the Hugging Face MCP combined with the arXiv one, I don't know how to pronounce that, the one with the papers, arXiv. The two of them combined allow you to investigate quite quickly which models could solve your use case, if you know more or less what type of model you're looking for. It helps you a lot. (1:03:26)

Mariano: Nowadays much more NVIDIA TensorRT, making sure to support shipping. (1:04:31)

Alexey: You also need to serve your models yourself? Some of these models LLMs I mean. (1:04:37)

Mariano: Some of these models. TensorRT is for LLMs but not only. No not only for any CLIP. (1:04:44)

Alexey: Yeah. So we use any vision model that will work on Nvidia right so you get the optimization of Nvidia. What else do I have? (1:04:55)

Alexey: I think it's already a large list. (1:05:07)

Mariano: Yes. (1:05:13)

Alexey: Yeah. So there was a comment about uv. If somebody is not using uv, they have to stop whatever they're doing right now, drop everything, and spend the next 5 minutes learning uv. You don't need more than that. It changed my life. It's the tool I didn't know I needed. Previously you could start installing dependencies and then have an entire hour of free time, but now you install them and it's done in 5 seconds. (1:05:13)

Mariano: Yes. (1:05:37)

Alexey: So there was a question about whether I can share the list of tools. YouTube will process the video and generate a transcript. What you can do is copy the transcript to ChatGPT and ask it to give you the list of tools. So this is your homework. So thanks, Mariano. Sorry for taking a bit more of your time. I really appreciate it. (1:05:44)

Alexey: There were some questions from Pablo that we didn't cover and that would require more time, but I got very interested in the discussion I had with Mariano. I saw the questions from Pablo, but I was too interested in continuing this. Well, maybe you can discuss this in Slack. You can reach me on Slack. I'm in the DataTalks Club. So just a question for you, Mariano. (1:05:57)

Alexey: Maybe you think, okay, this is something interesting I can share on my LinkedIn or wherever. Pablo was asking you to describe the data and ML pipeline used in your projects and the tools used. I think we covered that more or less briefly. Another question from Pablo was to mention three to five technical challenges you face in your work. So maybe you can make a LinkedIn post and we'll all go and upvote it. (1:06:27)

Mariano: Good idea. (1:06:47)

Alexey: Thanks everyone. Sorry Pablo, but I hope I managed to cover some of the questions you asked. Thanks a lot for asking them. I deliberately didn't cover some of the questions that some of you asked because I found them off topic. You can bring these questions to our Slack and I will be very happy to discuss them, but they aren't a good fit for Mariano and today's discussion. (1:06:52)

Alexey: So let's take them offline to Slack. And Mariano, thanks a lot. I always like talking to you. I really enjoyed this conversation. I wanted to ask you much more, but I think you already gave a lot of information. Many people say that the role of an engineer is becoming kind of full stack, and I got another confirmation of that observation from you. (1:07:04)

Alexey: Thanks Mariano. Thanks everyone. (1:07:34)

