From Testing Phones to Managing NLP Projects

Season 11, episode 1 of the DataTalks.Club podcast with Alvaro Navas Peire

Transcript

The transcripts are edited for clarity, sometimes with AI. If you notice any incorrect information, let us know.

Alexey: This week, we'll talk about career transitioning from quality assurance to machine learning. We have a special guest today, Alvaro. Alvaro worked in the cell phone industry as a quality assurance engineer, but got tired of it, spent a few years unemployed, fell in love with machine learning and eventually got hired by a consultancy company. (1:15)

Alexey: Right now, Alvaro is managing machine learning and NLP projects. I got to know Alvaro when he took part in our machine learning and data engineering Zoomcamps, and he published some awesome notes. If you're taking the course right now, you must have seen his notes – they're really amazing. If you haven't, please do check them out. He also helped other students a lot during the course. I'm very happy to invite Alvaro to this interview and talk about his career journey. (1:15)

Alvaro: Thank you very much for having me. (2:03)

Alvaro’s background

Alexey: You're welcome. Thanks for joining us. Let's start with your background. Can you tell us briefly about your career journey so far? (2:05)

Alvaro: Right. I'm from Barcelona, Spain and I studied here. I studied informatics engineering, which in the old Spanish education system used to be a mix of computer engineering and computer science. We did lots of programming, but we also took a look at the hardware side of things. I did that. I started working for a cell phone company for a bit here in Spain. I actually did quality assurance – I received prototypes, I tested them. (2:13)

Alvaro: Then this company changed. They didn’t change hands, but they changed the funding model or something. They started focusing more on Latin America. After that, I took a gap year and then they actually called me and said, “Hey, would you like to come to Mexico and work for us again?” And I said, “Yeah, sure. Why not?” I went there, I worked there for almost four years and I decided that I did not like that kind of job anymore. I did not enjoy the industry at all. (2:13)

Alvaro: I decided to quit and went back to Spain. I was unemployed for two years. I studied in the meantime until I found machine learning. And I eventually managed to get hired by a company working in machine learning. (2:13)

Working as a QA (Quality Assurance) engineer

Alexey: Interesting. What did your day look like when you worked as a quality assurance engineer? You mentioned that you received prototypes and you tested them. Was it like, I don't know, getting a phone and then clicking around on it? (3:41)

Alvaro: Essentially yeah. It's not like software QA – like for software testing. I did not do any kind of unit testing or anything like that. I just received a prototype and I did field testing. I went outside and checked the GPS, checked the map’s navigation, and how it worked. I tested the sound. I tested the microphone recording, I tested all the sensors in the phone. I did battery testing to see how well it performed. I checked for translation issues. (3:56)

Alvaro: The way these kinds of companies work is, actually most of the design and construction and pretty much everything is done in China. If you have money and you want to build your own brand, what you do is go to China and then you contact an ODM – a design house – and they will do everything for you. You only have to put in the money, you have to put your branding and you have to take care of the importing and the paperwork in your country. You can find that many different small unknown brands will sell the same phone in different countries, just with different branding. They just went to the same design house, where they already offered a ready-made solution and that's how it works. (3:56)

Alvaro: Sometimes you can also find exclusive designs, which is what my company did. We actually had designs that no one else had, but everything was still designed in China. I actually would have liked to go to China and get involved in that part of the design and development. Unfortunately, it was not meant to be. I decided to change careers and that's it. But yeah, I just received a prototype, I tested it and then made a report and sent it back. Then I would get a software update, which I would flash onto the phone and test it again. That's pretty much the workflow. (3:56)

Alexey: I guess you also had some sort of test suites – things you need to test, some sort of checklist, right? (5:47)

Alvaro: Yes, of course. (5:53)

Alexey: So it was some document. It wasn't like, “Okay, let me think about what I have to test today, right? Let me test the GPS.” [chuckles] (5:54)

Alvaro: I mean, at first, it was kind of like that. [laughs] Then, as we got more professional, we started doing our own checklists, and we wrote our own tests. In particular, for Android phones, you have to certify your phone with Google if you want to have Google Apps on your phone. Otherwise, you're not allowed to. You have to meet certain requirements and you have to pass certain tests. That's called the CTS. I don't remember what it stands for, actually. [chuckles] It's been a while. (6:00)

Alvaro: But I actually used to do that as well. Most of the CTS testing was carried out in China, but sometimes I would have to repeat those tests back at home. Most of them, at least, since some of them required special equipment, which I did not have. There are also more specific tests like RF testing, which you need a special lab for, which, of course, I did not have either. (6:00)

Alexey: I can imagine that at some point, it becomes a bit repetitive. You basically need to run through the same checklist with a new firmware update, right? Then it's like “Okay, let me check the GPS again. Let me check this and that again.” (7:00)

Alvaro: Yeah, it was very repetitive. Sometimes you would get new tests, because new requirements would come in. For instance, we started working on ‘voice over LTE’ back in Mexico, so we had to do new tests for voice over LTE. Stuff like that. New requirements needed new tests. But essentially, it was all testing and refinement of the previous tests. (7:14)

Alvaro: Sometimes the most exciting things would be like, “Oh, we have to do this specific field testing,” which is: get in a car, start a call, and then drive a specific route. You would pass the test if the call did not drop, essentially. You would also call the carrier – an engineer in the telephone company. They would record the actual test as well. They would get all the data and they would tell you whether you’ve passed or not. Sometimes those routes would change and it was like “exciting”. [laughs] But that's pretty much it. (7:14)

Alexey: I guess that's what made you decide at some point, “Okay, it's too repetitive. I want to quit.” Right? [cross-talk] (8:10)

Alvaro: It's more than that. But yeah. I mean, there were issues in the company – I was not satisfied with our output and how the company was developing. Lots of stuff. But yeah, the work itself was pretty repetitive. The people I worked with were very nice, but I did not enjoy that kind of work. (8:18)

Transitioning from QA to Machine Learning

Alexey: And then you decided to quit. But then you also mentioned that you were unemployed for some time. So you just quit without looking back – without looking for the same type of job to continue. How did you find the courage to just leave? (8:35)

Alvaro: I cheated in a way. [chuckles] Luckily for me, I have a very strong family support system and I had a bit of money saved up. It was scary for me in the sense that I did not know what I wanted to do with my life yet. But I was not scared of “Am I going to eat tomorrow?” Or “Am I going to have a roof tomorrow?” That was not an issue. I quit and I went back to my family. They were very understanding and I started studying right away, after a month-long break or something like that. That was my longest vacation, sort of. [chuckles] (8:50)

Alvaro: Then I started studying front-end, actually. I started looking into front-end development. Then I quickly realized that I did not enjoy that at all. I looked around at what I could study and then I fell upon machine learning, which I was interested in, and I joined a postgraduate degree course and I loved it. I really really enjoyed it. And here I am. (8:50)

Alexey: So you kind of knew that you wanted to work in IT? You mentioned frontend and then you kind of looked around at what else there was in the IT field and then you eventually ended up doing machine learning. Right? (9:57)

Alvaro: I mean, I wouldn't be opposed to working in a completely unrelated field. But I do not know anything about any other fields. [laughs] IT is what I know, essentially. IT is such a wide field, such a huge field that you can become specialized in a very specific thing and just have a fulfilling career there, in that specific field. I did not feel like I would have to move because I felt like there was so much more to explore. So yeah, I didn't contemplate it at first. (10:11)

Alexey: Why machine learning? You mentioned that you were doing frontend and didn’t like it, and then you came across machine learning. So what did you like about it? (10:51)

Alvaro: It was challenging. I liked the math. It was pretty hard. But I felt like, “Wow! This is substantial,” because most of what I did before in QA, I did not really apply what I had learned back at school. I had never worked in software development before. I had never done anything of the sort. I was essentially writing Excel spreadsheets and that sort of stuff, which I did not enjoy. (10:59)

Alvaro: When we started studying and we had to do the project for the course, I realized that I was having fun. [laughs] That's it, pretty much. I was already interested in artificial intelligence. Because I feel like there's so much more to do yet. I felt like this is a very wide field that I can grow in. I think it's worth diving in and studying it and getting good at it. That's it. There are so many more things that I'm interested in as well. But, you know – you can't do it all. So I chose to get into this and to do it properly. So that's it. (10:59)

Alexey: Sounds like you didn't really have a plan, right? You just went “Okay, I kind of like it. Let's see what’s there.” And you just explored this one step at a time. Or did you actually have some sort of plan? (12:13)

Alvaro: My plan was… Well, first I wanted to quit and I wanted to have time for myself – to get myself cleared out and figure out what I wanted to do with my life. Then I thought, “Well, I know I'm an IT kind of guy because that's what I know and I actually like it.” I'm interested in the field. I keep up with the tech news and everything. So I decided “Let's see what else there is.” I was pretty sure I did not want to do QA. I also did not want to do regular IT stuff, like sysadmin stuff. (12:24)

Alvaro: I did summer jobs before when I was younger, and I didn't really enjoy it. So I started looking. I wanted to get into the more development stuff. When I tried out machine learning and I learned more about it, I realized the width of the field and how much there was left to do. I said, “Yeah, this is a really interesting field. I want to get into it.” So I did. (12:24)

Alexey: Okay. So can you tell us what exactly the journey looked like for you? From the moment you realized “This is interesting” to the moment you started looking for a job in this field. (13:32)

Gathering knowledge about ML field

Alvaro: Sure. Before I joined any courses, before I started anything, I asked my friends. I keep in touch with a lot of university friends. Some of them are in embedded development, some of them are in machine learning, and some of them are in other similar fields. I just asked around, like, “What do you do? What kind of things do you do? Do you like it? Would you change something?” That sort of stuff. (13:42)

Alvaro: A couple of friends, when they talked to me about machine learning, recommended things like “You should read a couple of books. Check this video out. And then probably join a course or two if you're interested in joining a proper course, but maybe not a full master's degree. I have a couple of recommendations here in Barcelona.” I said “Okay, thank you very much.” I looked at the materials they suggested and I joined the postgraduate course that was suggested to me, because the head teacher was a very famous guy at the university I studied at. So I joined it, and I loved it. (13:42)

Alexey: That was just one course right? It wasn't a degree. (14:51)

Alvaro: No, it wasn't a degree. It was one course. It was a five month long course – it was a pretty long course – but it was a postgraduate course. It's not a full degree. So I finished it and I said, “I should learn more. What should I do next?” The courses started in November or December and then finished in April/May. So I thought, “Well, I cannot join any course or any degree right now because it's the end of the school year. So what should I do?” Looking at my options and after talking about it with one of my buddies in the course, he suggested that I join a summer school course, which I did. (14:55)

Alvaro: It's called Neuromatch Academy. It's a nonprofit association, which does these neuroscience conferences and they also do courses. It's meant for neuroscience, but they also have this one-month-long module for deep learning for neuroscience. Since I was looking for projects to build up my portfolio and to get more actual experience developing neural networks, I said, “Let's join it.” That was last summer. Not this year – last year. It was a pretty tough course. I liked it. I don't remember almost anything from it [chuckles] because it was neuroscience, which I don't know. I had issues keeping up, but it was interesting. (14:55)

Alvaro: After that, another friend of mine who used to live in Berlin, suggested that I join DataTalks.Club and I said, “Oh, sure, I'll take a look.” And I joined the Machine Learning Zoomcamp, which I really enjoyed. Then I joined the Data Engineering Zoomcamp, which I also really enjoyed because it was a part of the equation that I had almost no knowledge about. (14:55)

Alvaro: Then there was the Machine Learning Ops, which I had to drop out of because I was already working and I could not keep up with everything. [laughs] But yeah, that's it. I did those three things after the postgraduate course. My goal was to build up my portfolio by doing projects, maybe joining Kaggle. [cross-talk about pronunciation] Perhaps doing a few of those, which I actually never ended up doing because I got hired before I had the chance to start. (14:55)

Alexey: So you didn't have a portfolio? I mean, apart from the course. (17:25)

Alvaro: Yeah. Apart from my GitHub, which is my notes and things from the courses and the stuff from my postgraduate course as well. That's it. I did not have an actual portfolio. But, apparently, we’re in high demand. [chuckles] They’re trying to find as many people as possible to get to work right now. I guess I was lucky in that regard. (17:28)

Searching for an ML job (improving soft skills and CV)

Alexey: What was the most difficult thing for you? You did a postgraduate course, then you did summer school, then you did two Zoomcamps from DataTalks.Club and then you started looking for a job, right? Or when did you actually start looking for a job? At what point? (17:57)

Alvaro: At the beginning of this year, perhaps. Besides the technical knowledge, like how to do actual machine learning, I actually had a few gaps about how to approach an interview, how to prepare your curriculum vitae – that sort of stuff. Soft skills that you need to actually get hired, which are not actually related to what you're supposed to work in. But those are hurdles that you have to pass anyway, if you want to get hired. I hired a coach and he helped me figure out stuff. (18:18)

Alvaro: He got me started on how to talk to people, how to not be confrontational, but at the same time, how to plant your feet in the ground and how to defend your interests. Also how to prepare my CV, because also CVs are becoming more… I actually showed you my CV and you said, “I prefer the kind of CV which is like a list of things,” which is what I used to have, but then my coach said, “Yeah, that's fine. But nowadays, most people don't want to actually look at a list of things because they have so many things to sort through – they want something visual, they need something visual.” (18:18)

Alexey: Interesting. [chuckles] (19:43)

Alvaro: Yeah. So I did my CV in a more visual manner. That sort of stuff. Anyway, I managed to find some opportunities, through some contacts as well, like “I'm looking for a job.” – “Okay. I have this guy who's interested. Let me talk to him and maybe you can get an interview.” – “Cool. Yeah.” I still applied through LinkedIn, but then I also got in touch with my contacts, got in touch with some more people, and I eventually started some processes. (19:44)

Alvaro: I started one process before I actually got serious with my coach. That did not go well because the person who interviewed me started asking pretty tough questions – things like, “So you have a data frame, but this data frame in pandas actually exceeds the memory size that you've got available in your system. What happens then?” And I was like, “I don't know, man. I guess it will crash or something, but I'm not sure what will happen then.” [chuckles] You know, that sort of low-level stuff that I was not prepared for at all. He told me, “Thank you for the interview, but you're in a very junior position and we were looking for someone more experienced.” And I was like, “Yeah, sure. Whatever. Thank you for the opportunity.” But that's it. (19:44)
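
For readers curious about the interview question: when a dataset does not fit in memory, loading it in one go typically ends in a `MemoryError` or in the operating system swapping heavily or killing the process. A common workaround is to process the data in pieces and keep only running totals. In pandas, `read_csv` accepts a `chunksize` argument for this. The sketch below shows the same idea using only the Python standard library. The function name `streaming_mean` and the sample data are hypothetical, invented for illustration, and are not something discussed in the episode.

```python
import csv
import io


def streaming_mean(lines, column):
    """Compute the mean of one CSV column while holding one row in memory at a time.

    `lines` is any iterable of text lines (an open file, a StringIO, ...).
    """
    reader = csv.DictReader(lines)
    total = 0.0
    count = 0
    for row in reader:
        # Only running totals are kept, so memory use does not grow with file size.
        total += float(row[column])
        count += 1
    return total / count if count else float("nan")


# Hypothetical tiny dataset standing in for a file too large to load at once.
sample = io.StringIO("city,temp\nA,10\nB,20\nC,30\n")
print(streaming_mean(sample, "temp"))
```

The same pattern works for sums, counts, minimums and maximums. Statistics that need all values at once, such as an exact median, need a different approach.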

Alvaro: Then I did a couple more processes. The first two were an interview in order to get to know the people and then I was supposed to do a technical exercise. The first one, after the exercise, I got rejected. They told me “Thank you, but we're not interested.” The second one, they actually made me an offer, but that was after I had done the third interview – the third process – which did not have a technical aspect beyond a few questions that they asked me right there in the interview. They made me an offer in that third interview. And I said, “Sure, let's go for it.” But maybe if I had received that offer before I had done the third interview, I might have accepted it. So who knows? Maybe I would have been working for a different company right now. Except for the first process, which did not go well, all of these processes – the last three – were NLP-related, which is funny. It's interesting. Because I did not actually explicitly look for NLP projects. [chuckles] It's funny how that works. (19:44)

Alexey: Did you actually have any NLP experience? In our Zoomcamps, we didn't cover that. [Alvaro agrees] Did you study this separately? (22:02)

Alvaro: I did some NLP back in my original postgraduate course. I was familiar with it. I was not an expert, because actually the NLP part, I did not enjoy it as much as some of the other parts of the course. We had different teachers and the teachers we got for that part weren’t my favorite. But yeah, I did have some experience. It didn't come as something completely new for me. (22:13)

Data science interview skills

Alexey: We have a few questions. The first question is, “What kind of soft skills did you have to prepare for your data science interview?” You mentioned not to be confrontational and things like that, but what were these skills exactly? What did they look for and what did you need to prepare for? (22:38)

Alvaro: I worked through that kind of stuff with my coach. The way we did it was with role playing exercises. If you have a friend or you know someone who would be interested in helping you out and maybe could play the part of being a tough interviewer and put things in a difficult manner to you, I think that would be a very cool exercise to do and to prepare for it. But essentially, it all depends on how you are. I'm the kind of person who… I don’t know how to sell myself to others. I undersell myself. I look at the stuff I do and I think “Yeah, I'm not very good at this.” But then I look at some other people's work and I realize, “Woah! This guy sucks. Maybe I'm not so bad at it.” (23:02)

Alvaro: But I still don't have the confidence, perhaps, to sell myself in a way that objectively reflects the actual quality of my work. For me, one of the most difficult things was to actually not undersell myself – not being “Yeah, I'm not that good at this.” You cannot say that to someone who is interested in hiring you. But you actually do not want to lie either. I did not want to sell myself in a way that did not reflect reality. That was really tough for me. (23:02)

Alexey: But your Zoomcamp projects were amazing. Your notes were excellent. [cross-talk] (24:35)

Alvaro: My notes are fine. My projects – not so much. [cross-talk] (24:44)

Alexey: Come on. They were good. (24:47)

Alvaro: I mean, compared to some. Some people were amazing. Some people were like, “Wow! This is a very, very good project.” I think my project was average in both camps. But thank you very much for the compliment. I appreciate it. [chuckles] (24:48)

Zoomcamp projects

Alexey: Can you tell us about these projects? (24:57)

Alvaro: The projects I did for the Zoomcamps? (25:02)

Alexey: Yeah. (25:04)

Alvaro: Oh, man… I don't remember already. I forgot. [laughs] The first one was… wait, let me remember. Let me think. The project for the machine learning Zoomcamp was: I found a dataset, which was kind of fun – it was a speed dating dataset. That dataset had a lot of features, so many features. It was interesting because – I believe that the target feature was whether they would match, essentially. But there were a few features that were very dependent on each other. (25:05)

Alvaro: The exploratory data analysis step was of very high importance, specifically for that project. I think that I could have improved – I could have done much more in that regard. But it was fun. It was a very fun project to do because it was something completely unexpected. It was very funny. I found that dataset by chance. For my second project… Was that the midterm project or was that the final project? (25:05)
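
As an illustration of the kind of check that exploratory data analysis can include when features depend on each other, here is a minimal sketch that flags pairs of numeric columns with a high Pearson correlation. It is not taken from Alvaro's project: the function names, the 0.9 threshold, and the column names and values are all hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric lists.

    Assumes neither list is constant (a constant column would divide by zero).
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def dependent_pairs(columns, threshold=0.9):
    """Return (name_a, name_b, r) for column pairs whose |r| reaches the threshold."""
    pairs = []
    for a, b in combinations(sorted(columns), 2):
        r = pearson(columns[a], columns[b])
        if abs(r) >= threshold:
            pairs.append((a, b, round(r, 3)))
    return pairs


# Hypothetical columns: the second is just the first on a different scale.
ratings = {
    "attractiveness": [6, 7, 8, 5, 9],
    "attractiveness_scaled": [60, 70, 80, 50, 90],
    "shared_interests": [5, 3, 8, 3, 4],
}
print(dependent_pairs(ratings))
```

Pairs flagged this way are candidates for dropping or merging before training. Correlation only captures linear dependence, so it is a starting point rather than a complete check.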

Alexey: I think that was the midterm. (26:17)

Alvaro: Midterm, right? [chuckles] I cannot remember either. Let me take a quick look at my GitHub. (26:20)

Alexey: Yeah, I think you've demoed the midterm project, right? If anyone is interested, we have a demo from Alvaro in the playlist, where he shows the project. (26:26)

Alvaro: Oh, yeah! My second project was an image classification task. I had a dataset where it was a bunch of vegetables – I had fruits and vegetables. It was a very standard, run-of-the-mill, image classification project, I think. But I wanted to do something with deep learning, so that's what I chose. I think I wanted to do something else at first, but I did not have the time. I only had two weeks to turn in the project. I'm a very slow developer. That's another issue. [chuckles] (26:40)

Zoomcamp project deployment

Alexey: Another question we have is “Did you deploy your final projects in AWS?” What was your experience with this? Did you actually deploy them in the ML Zoomcamp? (27:16)

Alvaro: I did deploy them, but not to AWS. I deployed them to Google Cloud, I believe. (27:29)

Alexey: Google Cloud, okay. (27:33)

Alvaro: Yeah, I used Google Cloud, both in my postgraduate course and then in the Zoomcamp, because I was already familiar with it and we were working with Google Cloud because they have a very generous credit when you start. 300 dollars or something. Yeah, it's very, very good. (27:34)

Alexey: It's really enough for three months. Too bad it's just for three months, right? I could use them for way longer – like for a year. (27:53)

Alvaro: I mean, you can do “my email account number one, my email account number two, my email account number three,” and then keep doing new accounts. [chuckles] So you can get those credits all the time. But yeah, it's not… I don't feel good doing that. (28:03)

Alexey: So it wasn't AWS and it wasn't a problem for you? Right? You deployed everything. (28:16)

Alvaro: No, it wasn’t a problem. We did use AWS during the course, because we used it to deploy TensorFlow, embedded in Lambda and that stuff. The languages. (28:22)

Alexey: So you used it as well? Or you did this in Google Cloud? (28:38)

Alvaro: No. I did use it for the course. I followed the AWS steps for the course. But then my final project was in Google Cloud. (28:42)

How to not undersell yourself during interviews

Alexey: Interesting. So how did you solve this problem of not being able to sell yourself? The projects you just described look amazing to me. But then at the same time, there were people that had way cooler projects? (28:52)

Alvaro: I mean, yeah. I remember when we had to do peer reviewing, I looked at some of the projects that I was assigned and I was like, “Wow! This is blowing my mind. It’s amazing.” And then some of them were like, “Yeah, this is fine. This is a good project.” But some of them were like, way ahead of the rest. Your question was how did I solve my issue of underselling myself? (29:07)

Alexey: Yes. Or did you even solve it? (29:31)

Alvaro: I still haven't solved it, I think. [laughs] But I'm working on it. I think I'm better at it. I'm still with my coach, and we recently did some exercises and he told me “You've made some progress with this. At the beginning, you would have crashed right away at the beginning of the interview. But right now you are standing your ground.” He commended me for it. So I'm still working on it, but I'm getting better. (29:33)

Alvaro: Essentially, you explain what you can do, but you don't try to make yourself humble nor prideful. So you’re just trying to state things in as neutral a way as possible, and you do not belittle yourself in response to any comment the interviewer makes. That's essentially it. (29:33)

Alexey: So it’s basically doing the same thing you just did when you described your projects, right? You said what the data set you used was, what the problem you were solving was, what kind of tools you used for solving this problem, right? [Alvaro agrees] You just give objective facts about the project without saying, “Oh, maybe this wasn't the best project I've seen.” You don’t mention that, right? (30:30)

Alvaro: Correct. I would totally have said that. “This is not the best project. But you know…” I just did that, when I told you about the projects for the Zoomcamp – I undersold myself by telling you that it was a very run-of-the-mill project. Which I think it is, because it's just a very simple image classification problem. But at the same time, in an actual work interview, you shouldn't say that. You just say, “This is an image classification problem. I used these tools to solve it. And this is the task to solve.” And that's it. You let the interviewer make her own opinion about the project, rather than offering your opinion about it. (30:56)

Alvaro’s experience with interviews during his transition

Alexey: Were the projects of any use, actually, when you had interviews? Did anyone care about your portfolio? Did they ask you about the projects? (31:38)

Alvaro: Actually, they didn't, [chuckles] which was very interesting. They were more interested in the technical exercises. I've talked about four interviews – four processes. The first one was just an interview that didn't go well. Then the middle two, which involved doing homework of sorts – they gave me some questions and I had to turn them in a week later. And the final interview, which also involved technical questions, but they were very simple and I got hired with them. I was kind of surprised at the last interview because I thought that the questions they asked me were very simple. But they needed people and they hired me. (31:47)

Alvaro: The middle two processes, where I had a week to solve the problems they gave me – one of them was a bunch of hard questions and they told me, “You can look them up if you want to, but keep in mind that we'll do a follow-up interview and we'll ask you more questions, and you won't have the chance to look them up. So do whatever you feel like.” I was like, “Okay, sure.” So I tried to solve them all without looking anything up other than my specific notes that I had for my previous courses. I did have to look up a few things on Google, but then I just started them again, covering the Google results up. And that's essentially what I did. But then, that second interview never came. So I don't know what it could have been like. (31:47)

Alvaro: The second one was a project, essentially. They gave me a weather dataset – a time series, essentially – and they asked me to figure out how to predict weather in an approximate manner. It was a fun exercise, because I had never done anything with time series before. So it was fun. And that's it. The process with the time series is the one with the company that actually made me an offer, but I rejected it because I had a better offer from the other company. (31:47)
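
The episode does not say how Alvaro solved the weather exercise. As a rough illustration of what "predicting in an approximate manner" can mean for a time series, the sketch below uses a moving-average baseline and scores it by walking forward through the data. The function names, the window size, and the temperature values are hypothetical.

```python
def moving_average_forecast(series, window=3):
    """Predict the next value as the mean of the last `window` observations."""
    if len(series) < window:
        raise ValueError("need at least `window` observations")
    recent = series[-window:]
    return sum(recent) / window


def walk_forward_mae(series, window=3):
    """Mean absolute error when each point is forecast only from the points before it."""
    errors = [
        abs(moving_average_forecast(series[:i], window) - series[i])
        for i in range(window, len(series))
    ]
    return sum(errors) / len(errors)


# Hypothetical daily temperatures in degrees Celsius.
temps = [12.0, 14.0, 13.0, 15.0, 16.0, 14.0]
print(moving_average_forecast(temps))  # mean of 15, 16 and 14
print(walk_forward_mae(temps))
```

A baseline like this gives a reference error that any more elaborate model has to beat. Evaluating only on points that come after the training data avoids leaking future values into the forecast.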

Alexey: Did any of the interviewers care about your knowledge in cloud? Did they ask you about the cloud? Or it was just these questions that you mentioned? (34:01)

Alvaro: In the first one (the one that went badly), I believe that the interviewer asked me some cloud questions, but I honestly don't remember what they were. And in the last interview, they asked me if I was familiar with some of them, like, “Are you familiar with Azure?” And I said, “I haven't worked with Azure. But I have worked with Google Cloud and AWS before. Essentially, they're all similar. They just have their own specific quirks to each platform. But that's it.” I don't remember if there were any more specific questions, so no. But [cross-talk] (34:14)

Alexey: So they just asked if you know cloud and you said “yes,” and they were satisfied with that. (34:50)

Alvaro: Yeah, which is why I was surprised when I got the offer for the last interview because I thought “Is that it? You’re not asking anything else?” (34:55)

Alvaro’s Zoomcamp notes

Alexey: That’s how I also do it, to be honest. I don't know if it makes sense to go into the specifics of cloud. If you worked with a cloud, then okay. We have a comment. It's not a question but just a comment. “I am in the current cohort of the ML Zoomcamp. I always go back to Alvaro’s notes after finishing a week and revise what I learned in a week. Great notes.” I do fully agree. Maybe, in a way, your notes are like projects. You also mentioned that you were preparing for the hard questions – you were referring to your notes. I wanted to ask you – tell us your secret. How did you make these notes so good? What was your process? (35:02)

Alvaro: Man… You probably won’t like what I'm going to say, but I'm actually not very happy with my notes. Because… (35:50)

Alexey: [chuckles] Underselling? (35:58)

Alvaro: No, no, no. This is actually not underselling. Let me tell you why. I think they are too wordy. I think they're not notes. I think they're more like a literary work, in a way. The way I approached them is almost like I'm trying to write a book, which is not what I intended to do, actually. (35:59)

Alexey: You maybe can convert them to a book. [chuckles] (36:18)

Alvaro: I don't know – I've never written a book before. But the thing is, notes are supposed to be something that you can refer to. It's like an extension of your brain, right? You don't remember a specific thing, but you know where to look for it and you go and look there. Then you remember everything. (36:21)

Alvaro: But in my latest notes, especially for the Data Engineering Zoomcamp, if you look at those – they are way too long in the sense that it's actually hard to go look for specific stuff in those notes because they are so long. I actually had to create indexes for each lesson because they were so long. Honestly, I'm not happy with them. I mean, I am happy with the content. I think I did a good job formatting them. (36:21)

Alvaro: I actually had a lot of fun doing them and I learned a lot. But they do not accomplish what they are supposed to do, which is being an easy and quick reference to go look for stuff. (36:21)

Alexey: Like a second brain, right? (37:18)

Alvaro: Yeah. It's actually something I'm very interested in but have not managed to explore. I know there are these tools like Obsidian and RemNote and all these amazing tools that are out there, which are used for scientific research and everything. I’d really like to dig into those but I honestly haven't had the time to learn how to approach it because they are a little complicated at first. There are a few concepts like Zettelkasten and that sort of stuff if you want to look into it. But yeah, I just started writing in Markdown without any specific linking between sections. (37:19)

Alvaro: Then, with any gaps I could find like “This specific concept is such and such,” I would add links to external pages if I thought they were meaningful in any way or could help me or someone else. But that's it. As for the process and how it went, the way I work is this: I actually really enjoy watching online videos because you can pause them. I suck at listening and taking notes at the same time. A video is great because you can pause it at any time. You can listen for a little bit, then pause it and write down what you just listened to. If you didn’t catch something… [cross-talk] (37:19)

Alexey: You have like two screens – on one screen you have the video and on another screen, you have your editor. Then you listen and you type. Right? (38:29)

Alvaro: That’s it. I watch a little bit of the video, then I pause and then I go to my editing window (VS Code) and I just write there. If there's something more visual or anything that I don't feel confident doing a diagram for, I just do a screenshot and then I copy/paste it in the notes – I just add a link to the picture and that's it. That's how I work, essentially. (38:39)

Alvaro: So yeah, listen a bit, pause, write down, go back, then maybe think a little bit if something's not clear, watch it a couple more times. This is why it takes me so long to write the notes, especially in the Data Engineering Zoomcamp. It was a very dense course and there was so much content to go through. Luckily, I was not working, so I could spend all day long writing notes. [laughs] (38:39)

Alexey: [cross-talk] That's your secret to the process, right? [chuckles] Be unemployed and just write down everything on the video. You said that you also did quite some research. If something is maybe not explained at all or not explained well, you would link like some other resource that explains this thing. It's not just notes from the course, but also notes with some extra things from you. (39:33)

Alvaro: Yeah, because I wanted to understand why we were doing some things in a specific way. Sometimes I would have to expand the explanations in the videos with my own research, so that's what I did. However, in the Machine Learning Zoomcamp, at the beginning, my notes were very short because they were more like actual notes. It was basic stuff that I already knew, so I didn't feel the need to write notes for those things. I actually split the content of the notes between the actual notes and GitHub gists, which are – if you are not aware of those, for anyone who's listening in – it's this part of GitHub that allows you to put actual snippets of code and stuff like that. But they're not actually attached to any repo. (40:05)

Alvaro: It's just there and I thought, “I can do some cheat sheets and stuff like that.” And I would split writing notes between the gists and the notes. Some of the gists are good. I use the GitHub gists and the Conda gists. I used them all the time – those gists are great for me. But others I did – I did another one for Python, which I never use because it sucks. It's faster and better for me to just look up something for Python or pandas or SciPy or anything. (40:05)

Alexey: You mean look up on Google, right? (41:28)

Alvaro: Yeah. Looking it up on Google is actually faster than looking at the gist. My way of writing notes keeps evolving. It has been a continuous experiment, in a way. (41:30)

Alexey: Was it actually useful for you, personally? We as the community do appreciate it, because it helps to go back to them after the videos. But for you, personally – looking back now, was it useful for you? (41:43)

Alvaro: Oh, yeah. They were very useful. They are not super useful in the specific way that notes are supposed to be, which is as a quick reference to things. However, the process of building those notes was super helpful for me because when you actually write down stuff and you have to think of how to explain those things, it actually helps the ideas and the content to remain in your memory. In that regard, they were extremely helpful. Yeah. Keep in mind that if you ask me for anything specific from the notes, I will probably not remember because my memory is not that great, but I will go “Oh, yeah. This was written in that part. So I can go check it out online and go look it up.” (41:58)

Alexey: Actually, you know what? When somebody asks questions now about ML Zoomcamp and I think, “OK, in which video was it?” and then I will go to your notes. [chuckles] (42:43)

Alvaro: [laughs] It's actually something that I do too. Videos are great for explaining stuff, but if you want to look something up quickly, they're not so good. You have to scrub through the video to actually find the point where people are talking about something specific. People started adding those timestamps in the videos, which are super helpful. But it's still faster to view some reading material. It's better if you watch the video first, and then once you've watched it, you can have some written reference, which is much faster than going back to the video. (42:55)

Alvaro’s coach

Alexey: There is a comment, “Why do you think you're not a good communicator? I'm listening to you now and you're great.” [chuckles] (43:33)

Alvaro: Thank you. (43:42)

Alexey: I guess the coach also helped as well. [Alvaro laughs] Do you think having a coach in a situation like yours is important? Would you be able to do this without a coach? (43:33)

Alvaro: I might. Yeah. I think it was good. I could afford it and I thought, “Why not? It's going to be a faster way than doing it by myself.” But probably you can do it on your own. Maybe you can find some self-help books or anything that helps you. I actually have a couple that I can recommend that I should read. I still haven't had the time – how to talk to people, how to negotiate, and stuff like that. (43:55)

Alvaro: I actually have some pending exercises with my coach on how to negotiate for a pay raise, which we haven't done yet [chuckles]. That's going to be critical for my future if I want to earn more money. [laughs] But is it important or mandatory? No, I don't think so. It all depends on how you are and what resources you've got. It's on a case-by-case basis, I believe. (43:55)

Alexey: There is also a question about the coach, “Did the coach (your specific coach in this case) give any input on dealing with the hiring process, or was it more about soft skills and stuff like that?” (44:50)

Alvaro: Sorry, can you repeat the question again? [cross-talk] (45:09)

Alexey: Yeah, maybe I'll rephrase it in the way that I understood it. “Was the coach specifically helping you with hiring for a data science position, or was it more like a communication coach to help you with soft skills?” (45:11)

Alvaro: It was both. He did help me to actually target... He knows that I'm an aspiring data scientist and that I wanted to work in machine learning. But the coach does not know anything about machine learning. Everything he knows about it, he's learned from me or from some of his other clients. He knows computer stuff – he studied computer engineering as well. He's been working in this kind of field for so long that he hasn't kept up with this specific field. (45:28)

Alvaro: Most technical interviews are going to be essentially similar. The technical stuff changes between them, obviously, because they're different fields but the way you approach people and the way you talk to people and how you should prepare for them is very similar, I believe, between all of them. At least I'm not aware that you should approach an interview in a different manner, depending on what kind of field you're working on. Honestly, I don't think so. (45:28)

Alvaro: We did work on specifics in machine learning in the way that I would tell the coach what I think is interesting about machine learning and what I think the potential is and stuff. He told me “Oh, then in that case, if this is your opinion, then this is how you should explain it to the interviewer, because that will give the interviewer a better view of who you are.” That sort of stuff. But the specifics of interviewing are just generic, I think. (45:28)

Alexey: So it was more about how to approach the hiring manager interview, around how to answer behavioral interview questions. [Alvaro agrees] But not how to answer “What's the difference between random forest and XGBoost?” (47:16)

Alvaro: Yeah. The coach is unable to help me with that because he doesn't know that. That falls on me. [chuckles] I'm the one who's supposed to know that. (47:29)

The importance of mathematical knowledge to a transition into ML

Alexey: Do you think knowing these mathematical or implementation details is important for the interview process – for passing the interview process for a data science position? (47:39)

Alvaro: It depends on what your role is, what you're being hired for, and obviously, the kind of person who interviews you. For instance, just today I got an email from one of the people that’s working right now on my project, and this guy did an analysis. He's been testing different neural networks for about two months now. We gave him our model and he took a look at it and analyzed it. He essentially deconstructed and reconstructed it and he gave us all the specifics of the neural network. That's amazing. It's going to be very helpful for our project. So knowing the ins and outs – the very specifics of machine learning or the implementation side of things – can be useful, for instance, in this specific aspect. But in the kind of work I do right now, I'm in more of a managing role, essentially. (47:51)

Alvaro: I also have to do some technical stuff, but honestly, much less than I was expecting. That is not so important because I'm working with people who already know how to do so. So it all depends, I think, on what you want to do and what you're being hired for. Sometimes, I think a very hard part of this is you are not exactly quite sure what you're being hired for or what the person who's interviewing you expects from you. That can be a handicap. But yeah, it can be useful to know these things. (47:51)

Preparing for technical interviews

Alexey: I actually wanted to talk about what you do at work right now, but we have so many questions. We can just go with the questions. The most upvoted question we have is “How to prepare for technical questions in an interview other than portfolio projects?” Taking notes or what? (49:32)

Alvaro: Honestly, whatever works for you, man. [laughs] Sorry, I think I saturated the mic. Preparing for technical questions is such a hard thing to do because you cannot know what you're going to be asked unless you know what you're being hired for. For instance, if the company that's going to hire you works on NLP projects, then it makes sense to study NLP specifically. What's NLP? What does it entail? What kind of tasks do you solve? How do you approach those tasks? Whatever. (49:55)

Alvaro: But sometimes you don't really know what to do. It all depends on what kind of technique you are most comfortable with when studying. Taking notes from videos is fine. So is reading your past notes, which is what I did. Doing some exercises and building up your portfolio is also good, because not only do you have something to show, but those skills are also being ingrained into your brain when you carry out those projects. So it depends on how you want to approach those. Whatever works for you, honestly. (49:55)

Alexey: Even though they didn't specifically ask you about these projects, it was still useful for you to master the skills, right? (51:00)

Alvaro: Right. (51:06)

Alexey: But I guess it's pretty unusual. Maybe they just didn't ask you about them but they did look at them. Everything is public on your GitHub. Maybe they just went through this and just didn’t ask. [cross-talk] (51:08)

Alvaro: Yeah, they did ask me some things. In the company I work for, actually, the two people who hired me are not working for the company anymore, which is kind of funny. But I was asked a few technical questions. They were just very simple, like, “If you have this kind of task, how would you solve it?” And I would do a very high level approach to the task. That sort of thing. But nothing super specific and there were no coding exercises in this particular instance. In those previous processes, I had to turn in code. So it all depends. (51:20)

Alvaro’s typical workday

Alexey: Can you tell us what your typical work day looks like? (51:53)

Alvaro: Sure. I read email. [laughs] A bit of it. And I also check out Teams, when my computer decides to run it because holy crap – Teams is such a hog. (51:58)

Alexey: You're on a Mac, right? (52:13)

Alvaro: Right now I'm on a Mac, but my work computer is a PC. (52:14)

Alexey: So it’s Windows. And it doesn’t run Microsoft Teams? (52:19)

Alvaro: It only had (only!) eight gigs of RAM, which I thought was not enough. Then I recently upgraded to 24 gigs of RAM, which is amazing. Right now I think I can finally run Teams properly. [chuckles] But yeah, my usual day. Right now I'm in charge of a project. What I do is, I've set up a way of managing it. I've got an Excel sheet. I've also got Microsoft Planner, which is a Trello-like thing that is included with Office. It also integrates well with Teams, so it's good because you're in your specific team and then there's a tab with the tasks there, which is great. (52:22)

Alvaro: Right now there's only two of us, it's only me and one other guy, so we divide what we want to do. We work from home and each one of us works on a specific thing and then at the end of the day or halfway through the day, we update each other on what we've done. And that's it. For instance, back in April/May, that was the zenith of the project – the point with the most development going on. There were five of us and each one was coding different stuff. We had to agree on what had to be done, and then we had to agree on the tasks. So that's essentially it. (52:22)

Alvaro: It's more of a project management kind of thing than actual machine learning. There's less of us right now – there's only two people – and we'll have a third person coming soon, but not yet. So I also have to do some more technical stuff. (52:22)

Alvaro’s team’s tech stack

Alexey: What kind of tech stack do you use? (54:12)

Alvaro: Right. We are on Azure. Well, more like our client works with Azure, so we make use of the resources of the client. All of our code is in Python. We use Python and also AutoKeras, which is a library (if you don’t know) where you set up your dataset, you set up a task, and AutoKeras will look for the best model for you. So it's like everything is done for you pretty much automatically. You do have to fine-tune things, but essentially, that's what we did. (54:15)

Alvaro: Then we run those scripts – those Python scripts are being run on a Databricks cluster, and we orchestrate those scripts with something called Azure Data Factory, where you design your DAGs – which are essentially your workflows. You create your workflows on Azure Data Factory, so when something's triggered, then it calls a specific script, which then calls another script and that sort of stuff. Everything's written to a SQL database. Pronounced like “sequel”. Sorry, I'm used to saying the spelling. It's supposed to be SQL (sequel) in English. (54:15)

Alexey: Is it? Like sequel? I think both work, right? I hear both options. (55:28)

Alvaro: Okay, then whatever. SQL or sequel, whatever. [chuckles] Yeah, Microsoft SQL database on Azure. That's where all the data is being kept. It’s a stack that the developers were more familiar with, so that's what we went with. The project was supposed to be delivered in September and we had to deliver our first prototype in June, which was a shock. We had to speed up the process by three months. Right now we're extending it. We're adding more new stuff. But yeah, it's been a challenge. [chuckles] (55:35)
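
Editor's note: the workflow Alvaro describes (a trigger starts one script, which leads to the next, with results written to a SQL database) can be sketched in plain Python. This is only an illustration under stated assumptions, not the team's code and not the Azure Data Factory or Databricks API. The step names (`extract`, `train`, `score`), the `run_log` table, and the in-memory SQLite database standing in for the Azure SQL database are all hypothetical.

```python
import sqlite3
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def log_step(conn, name):
    """Record that a step ran; a stand-in for a script writing its results."""
    conn.execute("INSERT INTO run_log (step) VALUES (?)", (name,))


# Hypothetical steps standing in for the Python scripts run on the cluster.
def extract(conn):
    log_step(conn, "extract")


def train(conn):
    log_step(conn, "train")


def score(conn):
    log_step(conn, "score")


STEPS = {"extract": extract, "train": train, "score": score}

# Each step maps to the set of steps that must finish before it starts.
DEPENDENCIES = {"extract": set(), "train": {"extract"}, "score": {"train"}}


def run_workflow(conn):
    """Run every step in dependency order and return the order used."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS run_log ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, step TEXT)"
    )
    order = list(TopologicalSorter(DEPENDENCIES).static_order())
    for name in order:
        STEPS[name](conn)
    conn.commit()
    return order


# An in-memory SQLite database is used here purely for illustration.
conn = sqlite3.connect(":memory:")
executed = run_workflow(conn)
logged = [row[0] for row in conn.execute("SELECT step FROM run_log ORDER BY id")]
print(executed)
```

A managed orchestrator adds scheduling, retries, and monitoring on top of this idea; the sketch only shows the dependency ordering and the shared database.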

The importance of a technical background to transitioning into ML

Alexey: For somebody who wants to go through the same process as you – somebody who doesn’t necessarily come from a very technical background. I don't know how technical your background was, but I guess… forgive me if I'm wrong, but my interpretation of what you did is that it does not require a lot of technical expertise. Sorry, I don't mean to offend you. (56:11)

Alvaro: It did require some but I know what you mean. Please go ahead. (56:37)

Alexey: The question is – for somebody who wants to follow your path, whether from a less technical background or a more technical one, how do they do this? What would your recommendation be in this case? Would you suggest they just leave and spend two years being unemployed and take courses, and take notes? [chuckles] (56:44)

Alvaro: That's what worked for me, but it all depends on what works for you. I do have a technical background because I did computer engineering and science, whatever. I studied that, even though my actual work was not as technical. I did have to know how to do some stuff. I had to read all these Google documents, which explained what the homologation process was and what the requirements were. I had to know what kind of modifications were not supposed to be done on the phone in order to pass those tests. So it did require some knowledge about how Android is layered and that sort of stuff. I don't believe that you need a technical background in order to work in this field, but it helps a lot. I mean, it all depends on what kind of role you want to do. (57:05)

Alvaro: If you want to be an actual data scientist, like working in most of the high-level stuff, like experimenting with models and such, it helps a lot to have a mathematical background, because that's all very theoretical, in a way. There's lots of research going on and lots of new stuff is coming out all the time. So it really helps a lot if you have a mathematical background there. Actually, in my company, my colleague right now, he's a mathematician – he studied mathematics. There are quite a few mathematicians in the company. (57:05)

Alvaro: However, on the other side, if you want to do the actual data engineering part – if you want to actually do things like being very good at Spark, or you want to be very good at Kafka, or you want to be very good at any of the tools that are being used in machine learning workflows – then you don't need such a technical background, because it's all about knowing the tool and learning how to use it. I believe that you can come from any particular background and work on that. (57:05)

Alvaro: However, it helps if you have some technical background because then you already know all the high-level concepts and when you have to change to another tool – sure, you have to learn new stuff but it won't come in a weird manner to you. If you are learning Docker, you can use Docker. Fine, and now you have to use Kubernetes. Okay, then I'm learning Kubernetes. But if you know what a Linux kernel is and how it relates to the underlying virtualization, or sandboxing parts of stuff, it helps to understand why some things work the way they do. But they're not entirely necessary. So on that part of things, I think that you don't need a technical background for it. So it depends on what you want to do. (57:05)

Alexey: The way I understood you – you already knew how to program, right? [Alvaro agrees] This is something you studied in university with your computer science degree but [cross-talk] (59:51)

Alvaro: I did lots of Java. Which… Java sucks. Sorry. [laughs] (1:00:00)

Alexey: Yeah, but it's quite easy to pick up Python after any programming language. So if you have experience with any programming language, be it Java or C++ or whatever – JavaScript – then just starting Python should be easy. For you, I guess, that was the case. (1:00:04)

Alvaro: Python was really easy. There are some quirks to Python, like the way you do iterations in loops, which is quite unique, I believe. But programming in Python is a joy. It’s very, very easy. (1:00:26)
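
Editor's note: Alvaro does not say which loop quirks he has in mind. One plausible reading, offered here as an assumption, is that a Python `for` loop walks over the items of a collection directly instead of advancing an index counter as a classic Java loop does. A minimal sketch, with hypothetical list contents:

```python
# Hypothetical data: tool names mentioned elsewhere in the conversation.
tools = ["Docker", "Kubernetes", "Spark"]

# A for loop iterates over the items themselves; no index variable is needed.
shouted = []
for tool in tools:
    shouted.append(tool.upper())

# When a position is needed, enumerate() yields (position, item) pairs.
numbered = []
for position, tool in enumerate(tools, start=1):
    numbered.append(f"{position}. {tool}")

# zip() walks several sequences in step; these priorities are made up.
priorities = ["high", "medium", "low"]
labelled = [f"{tool} ({priority})" for tool, priority in zip(tools, priorities)]

print(numbered)
```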

Alvaro’s CV

Alexey: Okay. There is also a request to have a look at your CV. Maybe you can have a stripped-down version. Maybe not right now. If you want to show it right now, okay. It will be an exclusive. We will not include this in the audio version, so maybe you can also send us a link that we can include in the audio-only version. (1:00:40)

Alvaro: Sure. Let me copy the CV and post it to a Dropbox folder. (1:01:05)

Alexey: Okay, that's awesome. (1:01:17)

Alvaro: [Discussing sending the file with Alexey] (1:01:20)

Alvaro: It's not the best CV. There are lots of ways you can do a CV. I just went to Canva – it's a website for graphic design. I found a template I liked and I modified it to my needs. That’s it. (1:01:56)

Alexey: Okay, the link is in the live chat and I will also put it in the description. If you watch it in replay, you will see this. And I think that's it for today. I want to thank you for joining us today, for sharing your experience, for telling us about your career journey. Thanks for being here. And thanks, everyone, for asking your questions. (1:02:11)

Alvaro: Thank you very much, everyone. And thank you, Alexey, for having me. (1:02:35)

Alexey: Have a great weekend! (1:02:39)
