DataTalks.Club

From Academia to Data Analytics and Engineering

Season 8, episode 9 of the DataTalks.Club podcast with Gloria Quiceno

Transcript

Alexey: So this week we'll talk about becoming a data engineer – a data engineer or analyst – and we have a special guest today, Gloria. Gloria works as a business data analyst at ICE. I don't know if I pronounced the name correctly. Is it ICE? Like ice cube and ice cream. Right. Before working at ICE, she was a researcher and at some point, she decided to change her career, which is when she enrolled in a data science bootcamp. She will probably tell us more about that today. So, Gloria, welcome to our event. (1:58)

Gloria: Many thanks for having me. (2:31)

Gloria’s background

Alexey: Before we go into today’s main topic, let's start with your background. Can you tell us about your career journey so far? (2:34)

Gloria: Yeah. I originally studied biology and psychology in the United States and then decided to continue that track by pursuing a Master's in neuroscience here in Europe. After my Master's, I decided to join a research group here in Germany, in another city, Magdeburg. I worked in that laboratory for three years. The research I concentrated on was examining the effects of dopamine on long-term potentiation, or how memory is formed, essentially. (2:41)

Gloria: Throughout my time there, I did realize that after a while, I did have more of a passion for the technical stuff rather than the theoretical, scientific, or biological background of my research. Also, while I was there, in my last year, I had to establish a technique and I had to build a lot of my own apparatuses or reconfigure machinery, and one of the things also was to learn how to program in the C language, so that I could do some data analysis. And that part, I really enjoyed very much and I realized that this is what I really wanted to do in the long term and that in academic research fields, this was not so much a possibility for me. I wanted to go to a field where I have more possibilities, so I decided at the end of my third year contract, to not renew it, and then to start doing the career transition. (2:41)

Gloria: I eventually found this data science boot camp here in Berlin called Spiced Academy and I did that for three months. I really did enjoy it. I learned so much more than I had anticipated. Then I did the typical job hunt – you know, how everyone does – and eventually landed my spot at the company ICE as a business data analyst. (2:41)

Alexey: Interesting story. You said you needed to configure this machinery and tried something in the C language. What did this machinery look like? Is it something that you attach to the brain or? Not the brain, but your head I mean. [chuckles] (4:35)

Gloria: No. So I actually… Oh, no, no. [chuckles] I mean, I've done that before as well. But this one is essentially about – I had to work with animal tissue samples. I essentially had to harvest brain slices, unfortunately, from mice. And then I would have to place it in this massive microscope apparatus and hook up this little tissue sample to a bunch of different electrodes and wires. Then, using the controllers, I kind of tried to control the voltage and current that was being passed through the tissue, conduct my proper recordings, and then collect all that data and analyze it. (4:47)

Alexey: You needed to use C for doing this? For controlling the voltage and all that stuff. (5:25)

Gloria: Well, no. The programs that were used to actually conduct the experiments and to do the data collection, I think they were probably based on some kind of C language, but this was all software that’s not open source. It's essentially software the laboratory already buys and it's configured in the apparatuses and everything like that. (5:31)

Gloria: The stuff that I had to do, I did in a scientific program called Igor. It's not very well known in the open source community. But in the scientific community, it's pretty well known. That program has a flavor of C – it's not exactly C but it's very, very close to C. I had a copy of that program on my laptop, and it was the one I knew I could easily import my recorded data into to do the analysis. So I found this tutorial book on Amazon, ordered it, read through it and practiced, and then gave it a go with building my own little GUI and analyzing the data. (5:31)

Working with MATLAB, R, C, Python, and SQL

Alexey: With your background, was it difficult for you to learn to use Igor and its language? (6:33)

Gloria: Actually, I was surprised when I first learned a little bit about programming as an undergrad back when I was in the United States. I was working with MATLAB and I thought it was the most horrible thing ever. I was so petrified by it. Every time I had a course or any kind of class assignment to do with it, I just froze. It was impossible. But when I picked up a little bit of programming again back in Magdeburg, the other city I worked at, I just gave it a go. And I actually enjoyed it quite a bit. Because prior to doing that, or in parallel, I was taking this Coursera course to learn R. With R I actually felt that it was a little more difficult to pick up. It's not as simple as Python or C or anything like that. So when I had to work with C and this program, Igor – it takes quite a bit of practice, a lot of trial and error, as anyone who does programming knows – but if you're really, really passionate about it and you are really willing to put the hours in to practice and stuff, it becomes quite innate and normal. (6:42)

Working at ICE

Alexey: I know that MATLAB can be quite frightening, especially if you don't have any programming background. I also needed to use MATLAB at University for doing some signal processing. And I thought, “Okay, why are we being tortured with this tool? There are other tools that may be better.” So what do you do now at ICE? (7:46)

Gloria: At the moment – when I first started, I was hired as a business data analyst and I was hired to essentially help my coworkers work on what we call ‘tickets’. Essentially, both internal and external customers come to us, because we have the SQL capacity and access to the data warehouses in order to compile the reports that they need to then do their more high level analysis. I did that for the first couple of months when I started at ICE and it was more about learning. I mean I knew a little bit of SQL when I started, but it's really easy to pick up, you get the hang of it after one or two weeks. But it was more the business model that was actually quite complex. (8:10)

Gloria: Just things like, “Where can I find all the information? What parameters will I have to apply in the SQL statements in order to get the correct information?” Especially for business separation, all this data is not separated by customers in different tables. They're all just meshed together in the respective tables and I had to try to make sure that I wasn't providing information of one customer to another external customer. Yeah, I found this quite difficult at first. Also taking scripts that my coworkers previously wrote and then just rerunning them. And if a customer asked for some kind of alteration, I would spend the day just reading and understanding the script and then finding out how I am supposed to modify this script to be able to deliver results to the customer. (8:10)
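Editor's sketch: the customer-separation problem Gloria describes can be illustrated with a small query helper. This is a minimal sketch under stated assumptions, not ICE's actual schema or code. The table `streams`, its columns, and the customer ids are all hypothetical, and an in-memory SQLite database stands in for the real warehouse. The idea is that every report query takes the customer id as a bound parameter, and the result is checked before it leaves the function.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory table where rows of several customers are mixed together."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE streams (customer_id TEXT, work_title TEXT, stream_count INTEGER)"
    )
    conn.executemany(
        "INSERT INTO streams VALUES (?, ?, ?)",
        [
            ("society_a", "Song One", 1200),
            ("society_b", "Song Two", 300),
            ("society_a", "Song Three", 50),
        ],
    )
    return conn


def fetch_report(conn, customer_id):
    """Return only the rows that belong to the requesting customer."""
    rows = conn.execute(
        "SELECT customer_id, work_title, stream_count "
        "FROM streams WHERE customer_id = ? ORDER BY work_title",
        (customer_id,),  # bound parameter, never string concatenation
    ).fetchall()
    # Defensive check: fail loudly if another customer's row slipped through.
    if any(row[0] != customer_id for row in rows):
        raise ValueError("report contains rows from another customer")
    return rows


if __name__ == "__main__":
    demo = build_demo_db()
    for row in fetch_report(demo, "society_a"):
        print(row)
```

Keeping the filter inside one helper, rather than retyping it in every script, is one way to make the "wrong customer's data" mistake harder to make when an older script gets modified.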

Gloria: So that's what I did when I first started. There are also side projects where we do collaborations with other departments. Usually when I work on those types of projects, it's because I have the programming capacity and just the access to the data warehouse to then provide the data to the more business-oriented analysts – the ones that have a greater business background in terms of this specific data model. I work with them to clean up the data, fetch it and they also ask me for my interpretation. So I give it a go, try to get some insights, do some exploratory work, and then we all work together to try to come up with a story for the data – for this specific project. (8:10)

Alexey: So, you need to prepare reports, you need to fetch all the data in a form that is easy to put in the report. Do you also take care of the reports themselves, or somebody else uses the data to visually arrange them? (10:30)

Gloria: I usually don't really know what happens to the reports after I submit them to the customer. But from what I understand, all these customers don't really have any kind of programming capacity. So what they essentially do is a lot of analysis and look at the data in Excel. I guess they would create their basic graphics and other things they would need in order to present this data to some key stakeholder. For instance, if they needed a dashboard, we would be the ones that would create the dashboard for them – they would not do that. So if they were to look at the data themselves, it's mostly just Excel. (10:45)

Alexey: So what you do is more or less: there are some data pipelines and you need to get data from a database that you say is a little bit complicated. In these pipelines, there is data from different customers, which you need to carefully get from there, clean it, prepare it, and pass it over to somebody who then will analyze this data in Excel. Is that correct? (11:26)

Gloria: Yes, yeah. (11:50)

Alexey: I'm just curious, what does the company actually do? You said it's something related to music, right? (11:53)

Gloria: Yeah, music processing. There are two sides of the business that the company works in – copyright and online data processing. For the copyright side, they're essentially the data warehouse for all the music metadata from societies. For example, in Germany, there's the GEMA society, and in the UK, there's PRS. In Sweden there’s STIM, and then in the United States, there's BMG. (12:01)

Gloria: Essentially, different publishers and societies feed loads of data to us in terms of copyright information – the names of the songs, the creators, what rights they have to each work, what percentage of royalties they should be receiving, etc. The copyright side of the business takes care of all that. The online side of the business essentially takes reports from streaming services, ingests them, and processes them. Then they fetch the copyright data and calculate how much royalty should be paid out to each respective creator, society, or publisher. (12:01)

Alexey: Okay. So somebody gets a song, they stream it to the listeners, and then you need to figure out how many listeners listened to the song so that you can calculate how much they should be paid? Right? (13:05)

Gloria: Yes. Actually, when the streaming services send us reports, they essentially just state how many times a given work was streamed so we take that quantity. But oftentimes the reports are not fully clean, which means that we sometimes need to find ways to identify the works so we can match them to the copyright database. After they do the matching, then they can determine what the percentages are in terms of mechanical rights, performing rights, different types of rights, and how much a given artist should be receiving based on the stream count. (13:22)
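Editor's sketch: the match-then-split step described here can be shown in a few lines. This is only an illustration under loud assumptions. The copyright records, the title-normalization rule, the share percentages, and the flat per-stream rate are all hypothetical; real matching and the split between mechanical and performing rights are far more involved than this.

```python
import re

# Hypothetical copyright records keyed by normalized title.
# Shares are whole percentages that add up to 100 per work.
COPYRIGHT_DB = {
    "midnight train": {"Writer A": 60, "Publisher B": 40},
}


def normalize_title(title):
    """Lowercase, drop punctuation, and collapse whitespace so messy titles can match."""
    cleaned = re.sub(r"[^a-z0-9 ]", "", title.lower())
    return " ".join(cleaned.split())


def split_royalties(report_rows, cents_per_stream):
    """Match reported works to the copyright data and split the money by share.

    report_rows is a list of (title, stream_count) pairs. Returns a dict of
    payouts in cents per rights holder and a list of titles that could not
    be matched and need manual follow-up.
    """
    payouts = {}
    unmatched = []
    for title, stream_count in report_rows:
        shares = COPYRIGHT_DB.get(normalize_title(title))
        if shares is None:
            unmatched.append(title)
            continue
        work_total = stream_count * cents_per_stream
        for holder, share_pct in shares.items():
            payouts[holder] = payouts.get(holder, 0) + work_total * share_pct // 100
    return payouts, unmatched


if __name__ == "__main__":
    report = [("  Midnight Train! ", 1000), ("Unknown Tune", 5)]
    print(split_royalties(report, cents_per_stream=2))
```

Amounts are kept as integer cents so the split does not pick up floating-point rounding noise. The unmatched list corresponds to the works that, in Gloria's description, still have to be identified before any payout can be calculated.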

Alexey: And you do this part of the process, right? So you receive some data from somebody, but you need to clean this, you need to also match these things that are sometimes difficult to match. Right? (14:01)

Gloria: Actually, there's a couple hundred people that are involved in that full data pipeline [chuckles] ingesting the copyright data and making sure that it's correct. There are some people that actually have to contact the different societies and publishers and ask them to confirm that the metadata is correct. There's also people in the online department that make sure that the Oracle data warehouse is correctly ingesting the data and then they do some kind of inspection and see if the streaming services provided the correct report, or if they provided too many missing values. They have to go back and forth with the streaming service and tell them “You must send us a new report, one that’s more complete.” Then my job is essentially to fetch the data in any given time point in the pipeline based on what the customer requests. If they want to know that, “Oh, what was the state of the data at this time point A?” Then I should try to fetch it that way. (14:16)
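Editor's sketch: "what was the state of the data at time point A" is often answered with a point-in-time query over a history table. The example below assumes such a table exists, which the episode does not confirm. The table `share_history`, its columns, and the sample values are hypothetical, and SQLite stands in for the Oracle warehouse mentioned above.

```python
import sqlite3


def build_history_db():
    """Create a tiny history table: one row per recorded change to a work's share."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE share_history (work_id TEXT, recorded_at TEXT, share_pct INTEGER)"
    )
    conn.executemany(
        "INSERT INTO share_history VALUES (?, ?, ?)",
        [
            ("W1", "2021-01-01", 50),
            ("W1", "2021-06-01", 60),
        ],
    )
    return conn


def state_as_of(conn, work_id, as_of):
    """Return the share that was valid on the given ISO date, or None if none existed yet."""
    row = conn.execute(
        "SELECT share_pct FROM share_history "
        "WHERE work_id = ? AND recorded_at <= ? "
        "ORDER BY recorded_at DESC LIMIT 1",
        (work_id, as_of),
    ).fetchone()
    return None if row is None else row[0]


if __name__ == "__main__":
    demo = build_history_db()
    print(state_as_of(demo, "W1", "2021-03-15"))
```

ISO-formatted dates sort correctly as plain text, which is why the comparison works on strings here. Taking the latest row at or before the requested date gives the value as it stood at that moment.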

Gloria: Also, if they receive any kind of discrepancy in the reports, in terms of how much people are being paid, we have to try to investigate as to why this occurs. When these types of investigations happen, for instance, we provide the present state of the data and then we have to communicate to operation departments that actually deal with the ingestion in their day-to-day work and ask them, “Oh, what happened here?” and then they try to explain it to us after investigating within their entire department. So essentially, we're kind of the midpoint between the internal departments that do the data ingestion and the customers, and provide the correct reports for them. We do various types of reports, whether they are just Excel files that they can look at on their computer, or we build custom dashboards for them and then provide refreshes, so then they always have the most up-to-date data. (14:16)

Job hunting after the bootcamp

Alexey: Yeah, interesting. Can you tell us how you got this job? You graduated from a bootcamp and then, I imagine, it wasn't easy to get this job. So how did you do this? (16:14)

Gloria: Oh, yeah. That was quite a journey. So I finished the bootcamp, I think, at the beginning of August in 2020. I actually didn't get a job offer until December of 2020, so about four and a half months. (16:26)

Alexey: You started this already after the pandemic, right? So I guess you applied to the bootcamp before the pandemic and then graduated after, right? (16:41)

Gloria: Yeah, well actually… [cross-talk] (16:52)

Alexey: It wasn’t the best time to look for a job, right? (16:55)

Gloria: Yeah. I mean, if I had anticipated that the pandemic was coming, it would have made the job search much easier. But hey, you know, it's life. It happens. I think I found out about the bootcamp sometime around April of 2020. Then I finally got the funding and started it in May of 2020 and then it finished at the beginning of August of 2020. Then I had to do the job plans, send out a ton of applications, get constant feedback on CVs and cover letters and stuff like that. (16:57)

Alexey: So it took you 4-5 months, right? Quite some time. (17:37)

Gloria: Yeah, it was quite some time. When I talked to other fellow graduates, or other people that were also in the same boat as me, either doing career transitions or trying to break into the domain, they also had quite a bit of a hard time applying and sending out CVs and doing those take-home tests or those take-home challenges and stuff like that. (17:39)

Alexey: Yeah, so how did it go for you? You graduated in August and then you found the job in December. So what did you do in these four or five months that actually eventually led to the job? (18:05)

Gloria: In parallel to applying for all these jobs, going on LinkedIn, Googling around and stuff like that, I also volunteered for Omdena, doing one of their projects. I chose to do that as kind of a CV filler, but also to keep me occupied and keep me on my toes in terms of coding and understanding machine learning, and all that kind of stuff. (18:21)

Gloria: It was also just a good way to provide a distraction from the worries of always having to look for work and stuff. That was a really good experience. I learned a lot from other people and how they see coding, the solutions they came up with, the brainstorming, the collaboration and stuff like that. So yeah, that's how I also filled up my time during those three months. (18:21)

Alexey: It's “Omdena.” Right? (19:15)

Gloria: Omdena, yes. (19:17)

Data engineering vs Data science

Alexey: I remember you mentioned that last time we talked. I think it was before you found a job. I remember you were actually asking how to figure out if you’re more interested in engineering. But I think you did this bootcamp in data science, but then you realized you're more interested in engineering. (19:18)

Alexey: Maybe you can tell us a bit more about that. How did you realize that you're more interested in engineering rather than data science? And how did this affect your job search? (19:18)

Gloria: I remember at the time, I thought it was quite cool to be able to build pipelines or to build platforms or to build dashboards. I also found that in the real world, it's not like a data science bootcamp where the data is served to you on a silver platter and all you have to do is just analyze it. There's actually quite a lot of work that goes into it and that typically involves a little bit more data engineering and fixing up the data. So when I was searching for jobs, I was actually searching for both data engineering and data analytics. I would say that whichever one worked out first, that’s the one I would go for. (19:52)

Gloria: It turns out with this position that I eventually got hired for, it is titled “business data analyst,” but honestly, I feel like I do both – both data analytics and data engineering. With our team, we're in the process of trying to automate as many of these customer reports as possible so then we could do more advanced data analytics, predictive modeling, and stuff like that. A lot of this is about automating and reducing our workload, instead of just being SQL monkeys. This actually takes quite a bit of engineering, understanding the data warehouses, understanding how to optimize the SQL queries and integrating these SQL queries into either R or Python scripts, or even using Docker to push it to AWS and have it run at a fixed frequency. (19:52)

Using Docker

Alexey: And you learned all that at ICE already, right? Or did you already know a bit of that? I think you mentioned that you learned a bit of Docker at Spiced, right? (21:25)

Gloria: I first learned Docker at Spiced. One of the projects that we worked on was Docker-based, with a Twitter bot and stuff like that. Once I finished my time at Spiced, I didn't really touch it again. Then at the end of last year, we found in the team that we could work better, more collaboratively, and much more easily if we used Docker containers. For instance, there was one handover project I had from a colleague where he wrote all his scripts in R but on his local machine. (21:35)

Gloria: We had different versions of R libraries in the external machines. And then when it came to me, I had to test the scripts and see if they run. I couldn't execute any of them just because I didn't have the appropriate libraries and installations. When I tried to figure out which ones to do, I had to downgrade so many libraries. And they said, “No, this is not a proper solution.” So we essentially decided to rebuild the scripts in Docker containers and images to verify that they will always work. And now that's what we do with the project – anything we want to run remotely on AWS, whether it be long scripts or automated reports, we first build them in Docker and then just push them to AWS for the developer to then start the job or the frequency. (21:35)

Keeping track of job applications, employers and questions

Alexey: Coming back to your job search – do you remember how many applications you needed to send in order to eventually get the job? (22:57)

Gloria: Oof. It was painful. I actually had to count at the end of the year. I think it was around 130. (23:06)

Alexey: One hundred and thirty! (23:12)

Gloria: Yeah, yeah. I remember, I counted them up at the end of the year. (23:14)

Alexey: How did you keep track? Did you put this in an Excel spreadsheet? (23:17)

Gloria: Actually, I eventually had to do that. Since you have to send your application in PDF format and stuff like that, I kept every job description and every application that I submitted in PDF on my computer. Then over the course of two or three days, I had to go back to this directory on my computer and count how many job applications I sent and it was around 130ish. (23:21)

Alexey: Why did you need to create a folder for each application? (23:49)

Gloria: Oh, no. For each application, I had one PDF, which is the job description and the second PDF was the actual application that I sent, so the cover letter, CV and whatever form they wanted me to fill out. (23:54)

Alexey: Okay, so you kept this for your own analysis afterwards? (24:11)

Gloria: Yes, yes. It was just for my own records and for consistency. In case I need to reference back to anything, it's right there. But it's not like they're all in little folders on my desktop – they were subfolders within a folder. [chuckles] (24:18)

Alexey: [laughs] Desktop, yeah. Was it helpful to keep it organized in such a way? Usually when I look for a job, I just… Well, last time I wasn't. I was selective, but a few jobs before that, the strategy I used was spray and pray – just click as many ‘apply’ buttons as possible without really keeping track of where I applied. Then I would sometimes get invited to talk to a recruiter and I thought, “Okay, what does this company do?” They would ask me questions like “Tell us what you know about us.” And I thought [chuckles] “Hmm. What do I know about you? Nothing?” [laughs] So yeah, I was going crazy. Did you keep track of these things to actually stay organized and to understand where you applied and keep track of what the company is doing? (24:35)

Gloria: Yeah, I actually did. For applications, if I was really interested in a job post and I was fully intent on actually applying for it, I would save the job description as a PDF, because they take jobs down just as quickly as they post them. So it was just a personal record. It was also to see, if I was later in the interview process and they mentioned something about requirements like, “Oh, we offer these benefits and this is our salary range,” and stuff like that, I would go back to the job description to see, “Is this still the case or did they change it quite a bit?” (25:24)

Alexey: Aha! That’s smart. (26:02)

Gloria: Yeah, because you never know. That can happen as well. I also kept my application to see, “Okay, what did I actually say? What did I actually write in my original application?” in case they mention it again in the interview. It'd be good to be consistent, at least in their eyes. (26:04)

Alexey: After you had the interviews, did you also keep track of what the questions were, what you answered to these questions? Or was it more lightweight? (26:21)

Gloria: Not at first, but then I started realizing that I should. So I did that quite a bit. I didn't create a spreadsheet for every single interview, but I did create just one massive doc of questions that I felt stood out, or those that I felt were either very challenging or very good. Then I would write them down. I didn't exactly write down how I actually responded, but rather what my ideal response would be – the aftermath. When you’re not nervous or anything like that. (26:33)

Alexey: Yeah. So you know the question and then you come back to this question when you're not stressed, you just sit down and write your ideal answer. That probably helps you get this question right. (27:01)

Gloria: Yeah, exactly. I guess there's only so many finite questions companies can ask you. It's also a good way to practice. For instance, if I had an interview that I knew was coming up, I would take my long list of collected questions and just go through them one by one. Yeah, that helped a lot. (27:16)

Alexey: By the way, from these 130 applications, do you remember how many first interviews you had? (27:37)

Gloria: I want to say between 10 and 13. (27:45)

Alexey: So it's like a 10% ‘conversion rate’ or ‘success rate’? (27:49)

Gloria: Yes. (27:54)

Challenges during the job search and transition

Alexey: I see. What do you think was the most difficult thing during this process for you? [cross-talk] …maybe or something else? (27:55)

Gloria: I'm sorry? (28:05)

Alexey: Was it live coding perhaps, or maybe something else? I know many people find it very difficult to do live coding. So I'm wondering, was this the case for you? (28:07)

Gloria: Oh, yeah. I mean, I only had one live coding session and that one I thought was quite difficult. It's just that I'm a nervous interviewer and I don't do very well under pressure, especially if someone is looking at my screen. The thing that sucked about this one is that when they scheduled the interview with me, they didn't actually explain to me what would occur during it. I thought it would be like a multi-interview process where you would first meet the team and then possibly meet HR afterwards and stuff like that. But they smushed it all into one and that surprised me. I think in the last half an hour of the interview, they said “Oh, by the way. We're going to be doing a live coding challenge.” I was like, “Oh, that's so cheap. That's so mean.” (28:16)

Gloria: It did not go very well, because I was quite bad at writing Python classes back then and I had to do that live. At the time, it was very difficult. I was not able to do the live coding and the data engineer also wanted to see how I did Googling and stuff like that. I was quite nervous about that. I think about 10 minutes after the interview, I was able to write the whole class and I sent it to them like, “Oh, here. Just in case. If you want to consider me.” But it's so surprising how long it takes for you to do some kind of live coding when you're under pressure versus when you're just on your own figuring stuff out. (28:16)

Alexey: You said you couldn't do this live, but then after the interview ended, you just did this because you didn't have any pressure and then you sent it to them. And then what happened after that? (29:48)

Gloria: In the end, they sent me an email like one or two weeks later saying “Oh, we felt so honored that there were so many applicants that applied. There's so many nice applicants.” You know – all the fluffy stuff. (30:01)

Alexey: Basically a rejection letter, right? (30:16)

Concerns over data privacy

Gloria: Yeah. The letter said “Unfortunately, we're not going to continue with your application. But if you allow us, (because of the GDPR and data protection laws and stuff like that), we would like to keep your personal data on file.” In the end, I decided that I'd like to exercise my rights to data privacy. Should I want to apply for any position there, the ad will just pop up again, and I will just reapply. (30:19)

Alexey: So you asked them to delete your data? (30:48)

Gloria: Yes, yes. (30:51)

Alexey: Okay. And this way you can apply again, and they don’t know. Well, theoretically. If they follow the process. (30:52)

Gloria: Yeah, that is true. Yeah. Technically, if they follow the process they wouldn't know. But I mean, there's also all these companies that say, “Oh, we would like to keep your data just in case of anything.” Do companies actually go through their applicant pool to look for people? For me, I thought there's no point in approving this, especially in this day and age with all the data privacy problems. If I can avoid it, I would rather keep my data to myself. (31:00)

Challenges with salary negotiation

Alexey: I think I primed you a little bit when I asked what the most difficult thing was. Maybe you were going to say something else – not live coding? (31:29)

Gloria: I actually have to give it a good think. I would say there were many difficult things in different stages of the whole process. At the beginning, it was creating a good CV and a cover letter. That was hard at the beginning, but once I got into good practice with that, it was okay. The one thing that was difficult, and something that I still have trouble with, would be the salary negotiation. I think that this is the most difficult because I find that that's a key determinant as to whether they even consider you as an applicant and how far along you go in the application process. I feel it's quite a necessary skill set to have – to know your worth and to really insist on it. (31:36)

Alexey: But how do you gain that skill? Especially if you're fresh out of a bootcamp and you do not have such insights into the market? How do you know that? (32:26)

Gloria: I guess when you first start doing the job search and many companies ask you for your asking salary, you just have to do a lot of Googling and research. For instance, I think LinkedIn offers some stats, especially in the city that you reside in, regarding how much you should ideally be making. They're not very clear stats because they don't consider your experience level very much, so it's quite ambiguous. I also suggest talking to different people. The people from the bootcamp I did offered me quite a bit of career coaching advice and that helped a lot with understanding what kind of asking salary I should have. (32:39)

Alexey: I'm just wondering, if you're looking for your first job, would it be a good strategy to say, “I'm ready to take whatever salary you offer me. Just hire me,” or no? (33:21)

Gloria: Absolutely not. (33:37)

Alexey: Okay. [chuckles] Why not? (33:38)

Gloria: Absolutely not. Because, of course, they're going to give you the lowest salary they can. Companies are going to try to save as much as possible – they want to hire the most experienced person for the least amount of money. I've also heard that your asking salary is actually a reflection of how confident you feel of your skills. (33:39)

Alexey: Or of how well you know the market, right? Maybe you don’t know how much is normal. (34:16)

Gloria: Yeah, that is true. I think that's a skill set in itself – knowing the market well enough to ask for the right salary. (34:19)

The importance of career coaching and support

Alexey: Yeah, I imagine that could be tough even for people with experience already. Although with people that have experience and those who already have a job, they can just say “Okay, what I'm making now – let's add 10% to that. That would probably be a good baseline.” But if you don't have a job then “Okay, how much should I ask for that’s not too much and not too low?” For you, you said it was helpful to have a career coach at Spiced who helped you with these numbers, right? (34:27)

Gloria: Yes, having a career coach and also just having friends or colleagues from the bootcamp. People that are more veterans to the industry, know how to negotiate salaries, and know how to do a self evaluation for salary negotiation and stuff like that. I also consulted with them a lot and that helped a lot. So networking is actually quite helpful as well. (34:55)

Alexey: Can you share any tips when it comes to salary negotiation? Maybe something that your friends shared with you that you found helpful? (35:19)

Gloria: I'm trying to think back. I guess for the position that I got, in the end, I did ask for slightly below the market – 5K below – by accident. I typically asked for around the market average, but by accident for that application, I put in a little less. And I guess that's one of the reasons why they took me as well. I don't know, in terms of tips, I would say my friends, and the career coach as well, they told me to always ask for a little more and then you negotiate your way down. I also have a friend that told me that she waited to hear what their salary offer was, and then she always added one or two thousand more. She said, “They're not going to turn down my application over just one or two thousand euros.” (35:29)

Skills learned at Spiced

Alexey: Yeah, thanks for sharing the tips. So, you studied data science, but you ended up not doing data science. Was what you learned at Spiced actually useful for you? (36:20)

Gloria: Yeah, it really was. For instance, programming alone – I had never worked with Python prior to going into Spiced. Of course, now I work a lot with Python and SQL. A lot of the data engineering tools that I learned at Spiced, I also use on my job as well, or will use. Docker for instance, I use that a lot now. I also learned a little bit about Airflow while I was in Spiced and my team is interested in trying to incorporate Airflow into everyday activities. So, even though I don't do data science at the moment, there are plans for a couple of members in our team to start doing more predictive modeling – so more on the machine learning side. A lot of the things I learned at Spiced were definitely very helpful for my job. (36:33)

Retrospective on Gloria’s transition to data and advice

Alexey: Let's say you were to go back in time and look for a job again, but remembering all the experience that you have right now – what would you do differently now? (37:25)

Gloria: I would definitely say getting career coaching a little earlier in the game. I remember I did apply to a couple of jobs and had one or two interviews prior to attending the boot camp at Spiced and prior to getting through career coaching. Retrospectively looking back, I see that I made a ton of mistakes when I was doing that interview. Definitely about salary expectations – I would definitely do that much better. Let's see, what else? (37:46)

Gloria: In the end, we can't control everything and the markets are constantly changing, but I would definitely like to do even more networking. Though during the pandemic, it's quite difficult to do a lot of networking since everything was remote. You could only meet people via video or chat or something. Maybe I would also do the career transition even earlier, like back in 2015 or something like that, when you yourself did it. It must have been very different back then. (37:46)

Alexey: Yeah, it was different. I had just one interview and after that they hired me. [laughs] Sorry. It's like the complete opposite of the experience you had. (38:54)

Gloria: [chuckles] Yeah, it was. (39:02)

Alexey: I literally had just one interview and that was it. It wasn't even a series of interviews. Right now companies torture applicants through like four or five interviews. But for me, it was just one. First, the future Team Lead reached out to me on LinkedIn, we had a short call and then I went to the office on the next day. Then a couple of days after that, they gave me an offer. (39:04)

Gloria: Oh, nice. (39:30)

Alexey: Yeah, I don't think my experience of career transitioning is still valid. [chuckles] I don’t think it works that way anymore. (39:31)

Gloria: [chuckles] Oh, yeah. I think it's quite different now. (39:39)

Alexey: Yeah. So that's why as I said at the beginning, when people invite me to talk about career transitioning, I usually say “Yeah… things were different back then.” That's why I really like inviting people who did their transition recently, so they have a more fresh perspective on things. Because now it's more difficult. Now there are all these boot camps, the demand is maybe higher, but also, there are more people on the market. So it's very difficult now. (39:42)

Gloria: Yeah. Have you interviewed people that had the transition more recently, like within the past year or six months? (40:16)

Alexey: Well, maybe not. [chuckles] (40:25)

Gloria: I mean, I wonder what it is now. Is it even worse? (40:27)

Alexey: But still, for you, it took like five months. I think this is a pretty long time. And it was difficult. I hope it's easier now than it used to be. (40:34)

Gloria: I hope so too. (40:45)

Alexey: Okay. You said you made a ton of mistakes and that you needed to talk to career coaches who pointed out these mistakes. One of these mistakes was salary negotiations. Do you remember what kind of other mistakes you’ve made? (40:47)

Gloria: Other mistakes. I'm trying to think. Maybe it's something I said incorrectly in an interview – like they were expecting one answer and I gave them another. I think also a couple of the tech challenges that I did for another data engineering position. I think they didn't like some of the techniques that I used in my code and they felt like it was essentially the same thing over and over again. I felt like the instructions of the take-home challenge that they gave me were quite ambiguous. In the email exchange, they made a lot of references like, “Well, we haven't stated this. We haven't stated that.” And I felt like the hiring manager wasn’t very clear in communicating what they wanted. (41:03)

Gloria: So I guess, retrospectively, I would have spoken out and said, “No, this is not what you said. This is not how I interpreted it.” Of course, at the time, you're a little nervous and you're a little too humble, so you tend to stay quiet. But looking back now, I feel like I have a little more power because I have a job and stuff. I can be like, “No, that's not what you said.” So I can stand up for myself. (41:03)

Top skills that helped Gloria get the job

Alexey: Yeah. Interesting. Thanks for sharing that. I see that we have some questions. So one of the questions is from Vadim, who asks “What are the top skills or tech stack that you believe helped you to get this role?” (42:22)

Gloria: Definitely knowing a bit of programming – both Python and R. I did put that I knew SQL, even though I had only done it for like two weeks in the bootcamp. They liked it a lot that I went to a bootcamp. I had experience with using machine learning and stuff like that. So they definitely hired me for that. I felt like one thing that they really liked about me in the interview was how I recognize the importance of clean data and doing data quality checks. Because if you don't check your data and whether it’s clean or not, it really skews the results in your reports. I see it from day-to-day work that it's really necessary for us to do that. So I think that was one of the things that stood out. (42:38)

Alexey: Can you tell us a bit more about these checks for data quality? Was it a take-home test that you did and then you went the extra mile and added some checks or how did it happen? (43:22)

Gloria: For this particular position it was just SQL challenges where they hired another company to provide me with an SQL test – like a live SQL session. I think I had like an hour to answer three or four SQL challenges. But actually, this data cleaning thing came more in the interview process. I think in terms of skills, everyone goes to boot camps now or does these online courses and they essentially have kind of the same skill sets. I guess what they look for in interviews or what stands out, is if the person recognizes what's essential on the job. Because a lot of companies think – and it is kind of true – that you can learn things on the job. (43:37)

Gloria: Whatever technical skills you're missing, you can learn it on the job. For instance, if you know how to provide insights, if you know how to detect anomalies, if you know how to work, what to do, when the data is just not clean and not organized and you need to do quite a bit of fixing it up, and what steps you take and stuff like that. I would say maybe they consider this a little more important. (43:37)
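To make the kind of checks Gloria mentions a bit more concrete, here is a minimal sketch in Python. It is not what she or her team actually runs. The record layout (`order_id`, `amount`), the function name `quality_report`, and the outlier threshold of 5 are all assumptions made for illustration. It counts missing values, lists duplicated keys, and flags values that sit far from the median.

```python
from collections import Counter
from statistics import median


def quality_report(rows, key, value):
    """Summarize basic quality problems in a list of dict records.

    Hypothetical helper: assumes at least one record has a non-missing value.
    """
    # Records where the measured value is absent.
    missing = [r for r in rows if r.get(value) is None]

    # Keys that appear more than once are likely duplicated loads.
    key_counts = Counter(r[key] for r in rows)
    duplicates = sorted(k for k, n in key_counts.items() if n > 1)

    # Flag values far from the median, measured in median absolute deviations.
    # The cut-off of 5 is an arbitrary choice for this sketch.
    values = [r[value] for r in rows if r.get(value) is not None]
    med = median(values)
    mad = median(abs(v - med) for v in values)
    outliers = [v for v in values if mad and abs(v - med) / mad > 5]

    return {"missing": len(missing), "duplicate_keys": duplicates, "outliers": outliers}


# Made-up sample data, not from the episode.
orders = [
    {"order_id": 1, "amount": 20.0},
    {"order_id": 2, "amount": 22.5},
    {"order_id": 2, "amount": 22.5},
    {"order_id": 3, "amount": None},
    {"order_id": 4, "amount": 19.0},
    {"order_id": 5, "amount": 5000.0},
]
report = quality_report(orders, key="order_id", value="amount")
print(report)
```

Running a report like this before building a dashboard is one way to notice that a single bad row (here the 5000.0 amount) would otherwise dominate an average.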

Alexey: And you did learn quite a few things on the job, right? Things that you didn't know about, or maybe you just worked with them a little bit, like SQL – you said that you spent only two weeks learning SQL, but now it's pretty much all that you do all day, right? Maybe not that much, but like 50% at least. So you definitely can learn all that. (44:43)

Gloria: Yeah, exactly. I didn't know this before I started working on my job, but there are different flavors of SQL. Each data warehouse has their own format of SQL. For instance, one SQL query that I'll execute in one data warehouse, for example Redshift, sometimes won't work in an Oracle-based data warehouse, so I'll have to customize the queries in that way. So yes, I just learned to recognize that. (45:04)
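One common example of the dialect differences described here is row limiting: Redshift accepts a `LIMIT` clause, while Oracle (12c and later) uses `FETCH FIRST n ROWS ONLY`. The sketch below is only an illustration of that one difference. The function name `top_n_query` and the table and column names are hypothetical, and building SQL with string formatting like this is only acceptable for trusted, hard-coded identifiers.

```python
def top_n_query(table, order_col, n, dialect):
    """Build a 'top n rows' query for one of two SQL dialects.

    Hypothetical helper for illustration; identifiers are assumed trusted.
    """
    base = f"SELECT * FROM {table} ORDER BY {order_col} DESC"
    if dialect == "redshift":
        # Redshift follows the PostgreSQL-style LIMIT clause.
        return f"{base} LIMIT {int(n)}"
    if dialect == "oracle":
        # Oracle 12c+ uses the row limiting clause instead of LIMIT.
        return f"{base} FETCH FIRST {int(n)} ROWS ONLY"
    raise ValueError(f"unsupported dialect: {dialect}")


redshift_sql = top_n_query("sales", "revenue", 10, "redshift")
oracle_sql = top_n_query("sales", "revenue", 10, "oracle")
print(redshift_sql)
print(oracle_sql)
```

The same logical question ends up as two different statements, which is why a query copied from one warehouse may fail in another.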

Thoughts on cloud platforms

Alexey: Another question from Bala, “Which cloud platform did you learn? Did you learn any cloud platform in the bootcamp?” (45:29)

Gloria: Yes. At Spiced we did AWS, just because it's free for the first year. It's actually quite cheap to continue learning even after the first year is done. I did a little bit of Google Cloud Platform, but not as much. But I think in terms of cloud platforms, AWS is the most requested. (45:42)

Alexey: When you learned Google Cloud Platform was it already after the bootcamp? (46:06)

Gloria: Yeah, it was. I was actually exploring how to do some kind of dashboards in R at the time. I think there was this one Medium article that I found, which explained that it was possible on Google Cloud Platform. So for one weekend, I just sat down and gave it a go over there. (46:11)

Alexey: The good thing about Google Cloud Platform is when you sign up, you get $300 and you can do pretty much everything with this $300, while in AWS, there are only some services that you can use under the Free Tier. So with Google Cloud Platform, you have more freedom, but it's also only for three months. (46:31)

Gloria: Yeah. But I also actually remember from Google Cloud Platform, when I was trying out some stuff – those credits actually go by quite fast. If you don't know what you're doing, if you let things run all night, or you don't pay attention to what you're executing – those credits can go by pretty fast. (46:53)

Alexey: But at least it's not your $300. (47:10)

Gloria: Yeah, that's true. (47:12)

Alexey: That's good. Then, in the worst case, if the $300 is gone then you can just maybe create a new account. But if you have your $300 and they're gone – then you don't have them anymore. [laughs] (47:13)

Gloria: Quite painful. [laughs] (47:30)

Thoughts on bootcamps and courses

Alexey: Let's say you needed to go through the same career transitioning again. You work in academia doing research, your contract is over, and you want to go to programming. Would you go through a data science bootcamp again, or you would have done things differently? (47:33)

Gloria: In my personal opinion, from what I remember when I was looking for jobs in the market, a lot of companies treated online courses and data science bootcamps as the same. They actually didn't really see a difference. I personally think I preferred the data science bootcamp and I would do it again, just because I had a couple of different teachers to talk to, I could customize projects, and I just enjoyed interacting and working with other students. I think this is invaluable and I would definitely do it again. (47:54)

Alexey: Would you maybe consider doing a data engineering bootcamp? I think there is at least one, maybe two in Berlin. There aren't as many as there are data science bootcamps, but there are data engineering bootcamps as well. Would you maybe now consider doing this kind of bootcamp, or would you still go with the data science one? (48:30)

Gloria: I mean, that's hard to say. Because at the time, I think when I was doing the data science bootcamp, they tried to fit a little bit of everything – a little bit of data analytics, data engineering, and data science. (48:48)

Alexey: It sounded like that when you described it. You were doing Docker, you were doing machine learning, you were doing Airflow there. But data science is also a little bit of everything. I'd say you need to know, to some extent, a lot of things. (48:57)

Gloria: Yeah, I think it depends on the person. At the time, I think it was good for me because I wasn't sure what part of the data science field I wanted to get into – whether it be analytics, predictive modeling, or engineering. So it was good to go to a bootcamp that had a little bit of everything. But I guess if you're a person that is dead-set on something, like, “I definitely want to do data engineering,” then I guess those bootcamps would be great for you. So I guess they kind of serve your niche a little more. (49:16)

Alexey: I guess not everyone has the luxury of knowing exactly what they want to do after academia, right? In academia you understood that you wanted to do coding, but what exactly does it mean to “do coding”? Right? (49:47)

Gloria: Yeah, exactly. Even though I knew I didn't want to work in academia anymore, and that I wanted to do more analytics and “eccentric things,” I also wasn't 100% sure exactly what it was that I wanted to do. (50:02)

Spiced graduation project

Alexey: Can you maybe tell us a bit about the projects that you did at Spiced? At the end, I know that you had some individual projects where you needed to work by yourself on some things. I think you did something related to Twitter analytics, right? (50:15)

Gloria: Yeah. It was essentially learning how to build a data pipeline using Docker to fetch data from a Twitter API based on whatever hashtags or parameters you're interested in. Then coding one of those bots on Slack that would provide you with the Tweet information once every hour, or once every minute, or something like that. (50:30)

Alexey: Was it actually a project that you needed to do yourself – your individual project – or something that everyone needed to do? (50:58)

Gloria: It was one that everyone did. They outlined the project in a custom build and then they provided us with lessons every day on how to do a different part of the project until we delivered it on Friday. (51:04)

Alexey: So it took one week to build this pipeline? Was it a lesson about building pipelines? (51:17)

Gloria: Yeah. It was essentially learning to build Docker containers. One to just collect the tweets, a second one to clean them up, and then a third one to push the Tweet information to Slack every hour. (51:24)
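The three stages described here (collect, clean, push to Slack) can be sketched as three small functions chained together. This is a simplified illustration, not the bootcamp's actual project. In the real version each stage ran in its own Docker container and talked to the Twitter and Slack APIs, whereas here the tweet source is an in-memory list and the "Slack bot" is just a callable. All names (`collect`, `clean`, `publish`, the sample data) are hypothetical.

```python
import re


def collect(source, hashtag):
    """Stage 1: keep tweets mentioning the hashtag (stand-in for an API call)."""
    tag = hashtag.lower()
    return [t for t in source if tag in t["text"].lower()]


def clean(tweets):
    """Stage 2: strip URLs and collapse repeated whitespace."""
    cleaned = []
    for t in tweets:
        text = re.sub(r"https?://\S+", "", t["text"])
        text = " ".join(text.split())
        cleaned.append({"user": t["user"], "text": text})
    return cleaned


def publish(tweets, send):
    """Stage 3: pass each formatted message to a sender (stand-in for a Slack bot)."""
    for t in tweets:
        send(f"@{t['user']}: {t['text']}")
    return len(tweets)


# Made-up sample data; a real run would pull from an API on a schedule.
sample = [
    {"user": "ana", "text": "Solar is booming  #sustainability https://example.com/a"},
    {"user": "ben", "text": "Lunch time"},
]
outbox = []
sent = publish(clean(collect(sample, "#sustainability")), outbox.append)
print(sent, outbox)
```

Keeping each stage as a separate function with plain inputs and outputs mirrors the one-container-per-stage layout and makes each step easy to test on its own.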

Standing out in a sea of applicants

Alexey: I'm wondering. Let's say somebody graduates from bootcamp – at DataTalks.Club we also call them zoomcamps, but they're similar, like workshops. People finish these courses with a bunch of projects that they did during the course, and also individually as a part of the course. How do you actually “sell” this to future employers? How do you say, “Okay, this is the thing I did and I’m valuable because I know how to do these things.” So how do you actually show that you can do these things using these projects? Do they even believe you? Because if you do this as a part of the course, maybe you just followed the instructor and that's all, right? (51:42)

Gloria: Yeah. Yes, it is quite tricky. I mean, they always tell you, “Oh, you have to build a portfolio full of projects to show at your job application,” but then also when the employers see that these projects are basically from a bootcamp or from an online course, they probably see the same projects over and over again. I guess they do get quite tired of it. I would say maybe individualized personal projects would be better. Or if you custom-made a project yourself and show why you started the project, why you chose that topic, why you are interested in doing this particular project – I think this stands out a little more. (52:31)

Gloria: For one of the data engineering jobs I applied for – I didn't get it in the end – but one of the reasons why they chose my application to continue in the application process was because they found that the individual project at the end of the bootcamp that I did was something pertaining to sustainability. They thought that was interesting and they saw that I was interested in sustainability, so that's why they chose me for the application process. (52:31)

Alexey: What was the project about? Can you tell us about it? (53:34)

Gloria: It was essentially where I was trying to build a data pipeline where I would fetch Tweets that were associated with sustainability hashtags. Then I would clean up the data and I was in the process of trying to build a dashboard to show different stats and different features and different insights about the Tweets that were being Tweeted at the time. (53:37)

Gloria: One thing I never got to pick up again (or I just haven't taken the time to finish it up): I found that there are actually quite a few Tweet bots in the data, and that really skews the results. For instance, I did a little bit of sentiment analysis on the Tweets and the Tweet bots skew the sentiment analysis very much. Yeah, that's another lesson learned – you need to implement ways to clean this activity out so you don't get them in your final results. (53:37)

Alexey: Probably there are not so many of them – maybe a handful that are most active. You can maybe just remove all of them from the analysis. At least this is my experience with Twitter bots. There are just a few that are most annoying – the most visible ones. (54:33)

Gloria: Yeah, but I think the problem is just knowing to identify them, especially if you just have tons and tons of data in your model. (54:50)

Alexey: Right. You don't know which of them are causing problems. (54:58)

Gloria: Exactly, yeah. It's very difficult to tell. I know there are algorithms out there that can already be implemented in your pipeline which will auto-detect them – they'll help you flag which ones are actually Tweet bots. I know one at an academic research lab in the States, they consistently develop this algorithm to be able to determine the Tweet bots and they do Twitter analysis themselves for social sciences. (55:01)
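As a rough illustration of the simpler idea raised in this exchange (a handful of very active accounts dominating the data), here is a naive volume-based heuristic in Python. It is not the research lab's algorithm, which is not described in the conversation, and it will miss bots that post rarely. The function names and the thresholds `max_share` and `min_tweets` are assumptions chosen for the example.

```python
from collections import Counter


def flag_suspected_bots(tweets, max_share=0.2, min_tweets=5):
    """Flag accounts that produce an unusually large share of all tweets.

    Naive heuristic with arbitrary thresholds; a real detector would use
    many more signals than posting volume.
    """
    counts = Counter(t["user"] for t in tweets)
    total = len(tweets)
    return {
        user
        for user, n in counts.items()
        if n >= min_tweets and n / total > max_share
    }


def drop_flagged(tweets, flagged):
    """Remove tweets from flagged accounts before further analysis."""
    return [t for t in tweets if t["user"] not in flagged]


# Made-up sample: one account posts 8 of 12 tweets.
tweets = [{"user": "promo_bot", "text": "Buy green now!"} for _ in range(8)]
tweets += [{"user": name, "text": "Thoughts on recycling"} for name in ("ana", "ben", "cy", "dee")]

flagged = flag_suspected_bots(tweets)
kept = drop_flagged(tweets, flagged)
print(flagged, len(kept))
```

Filtering like this before running sentiment analysis keeps one noisy account from dominating the aggregate score, at the cost of possibly removing a very active human user.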

Alexey: Why was this employer interested in that? Is this something they also do as a company? (55:26)

Gloria: No, it was just because the title of my project was “Current Opinion on Sustainability”. I'm not sure if they actually dug into my project. In the interview process, they didn't ask me too many detailed questions about it. They just asked me things like, “Give a brief summary on this project.” And that was it. I think they were more interested in the fact that I was also interested in their mission because they were a sustainability company. In other words I was someone that took a personal interest in that topic. (55:32)

Alexey: Yeah, I heard a similar story, albeit a little bit different, which was also about personal interests. So one person got a job interview because they mentioned meditation as their interest. The person who was doing the screening was also into meditation so that's why they invited them and they talked a bit about that at the beginning. So you never know what will work out at the end. So sustainability could be your interest and meditation – it could be anything. But I guess this is not easy to predict. When you have a pile of applications, who do you select to interview, right? (56:06)

Gloria: Yeah, that is very true. (56:46)

The cohorts at Spiced

Alexey: We have another question from Chris. So Chris is asking, “What was your cohort like at Spiced? Were they all at similar career stages as you? People with PhDs, Master’s, etc.? Or did they have different backgrounds?” (56:48)

Gloria: Oh, we all had different backgrounds, but it was also quite nice. They were also in the transition phases of their lives. So even though we came from different backgrounds, we related to each other so much. We actually exchanged a lot of our personal career stories. There were a couple of former project managers, there were two people that were on the verge of finishing their PhDs, and there were others that used to work in consulting for many different domains. I've heard that there are some cohorts where they have either more or less diversity in terms of different backgrounds. I think either the cohort before or two cohorts before me were mostly academics trying to get into machine learning and stuff like that. But yeah, every cohort is different. (57:06)

Alexey: Which background was the most interesting to you? Do you remember? (57:57)

Gloria: I think the one that had quite a big transitional difference. It was one of my colleagues who has a background in literature and was doing his PhD in the classics, but he wanted to get into data engineering and coding. I felt like that was a huge jump. Yeah. But he was super good. Anytime there was a bug I couldn't solve, I always went to him and said, “Oh, can you please help me out? I got this bug.” And he was like, “Oh, yeah. Let me take care of it for you.” (58:03)

Alexey: Yeah, interesting. A couple of episodes ago, I invited Jessica, who was working as a barista and went into coding. That's also quite an interesting change – from barista to coding. Okay, I think we should be wrapping up. Do you want to say anything before we finish today? (58:37)

Conclusion

Gloria: No, I think your questions essentially covered all of them. It was really cool doing this. This is the first time in a long time that I've been asked about this whole transition phase in my life. It was nice reminiscing about all these things. (59:02)

Alexey: I think one year – or maybe one year is a bit too long – like six months to one year after somebody already quit a job is a good time to ask how they did the transition. Not that I waited on purpose for one year before contacting you. But then I realized, “Okay, maybe I should talk to Gloria and ask her about that.” Yeah. Thanks a lot for joining us today, for sharing your experience, for talking about all the difficulties you had in your job search process. Thanks, also to everyone for attending today and for asking questions. I think that would be it for today. Thanks again. (59:16)

Gloria: Thanks so much. (59:54)

Alexey: Yeah, have a great weekend. (59:57)
