DataTalks.Club

Data Science Manager vs Data Science Expert

Season 6, episode 3 of the DataTalks.Club podcast with Barbara Sobkowiak

Transcript

Alexey: This week, we'll talk about two different roles. We'll talk about the data science manager role and the data science expert role. We'll also talk about how they are not the same and why they are different. There are probably some similarities there as well. We will cover that this week. We have a special guest today, Barbara. Barbara is a data scientist by profession. You’re a data science manager, right? (1:29)

Barbara: Yeah, exactly. (1:58)

Barbara’s background

Alexey: Welcome, Barbara. Before we go into our main topic, let's start with your background. Can you tell us about your career journey so far? (2:00)

Barbara: Yes, of course. I graduated with a degree in GIS (geographic information systems) from the University in Poznan. I am a specialist in spatial analysis. At the time, I learned how to work with very specific data, which are spatial data – they are different. After graduation, I worked in telecommunication components in a Java specialist position. Although I loved this job, after a few years, I decided to come back to Poznan and find a new job and change something. So, I became a SQL Developer and I learned very well how to work with databases and reports. Quite quickly, it turned out that my strong points are analytical skills and establishing contact with clients. (2:28)

Barbara: I continued my career as a business intelligence analyst, being responsible for the analytic part of the data warehouse construction and reporting projects. I also prepared some visualizations and data analysis. At the same time, I also became one of the leaders in our BI department, which is how I also developed my leadership skills. But I always found that we could do something more with data – some advanced analytics. My goal was to gain more insight from data. One day, we decided to build our own data science team. After some time, I became a data science manager and I got the opportunity to build my own team and take responsibility for the whole data science area. Now, along with my team, we are building machine learning solutions. This is the very short story of my career journey.

Alexey: What kind of solutions are you building? (4:23)

Barbara: Now, we have two main topics. One is a model that predicts, for people who are suffering from bipolar disorder or depression, whether their phases change or not. This is important for doctors. The other solution is a small commercial project that involves demand forecasting and recommendation, so – inventory optimization. (4:26)

Do you need a manager or an expert?

Alexey: That's quite interesting. Thanks for sharing. I've been following you on LinkedIn for quite a while. I think you post regularly – maybe every week or every two weeks I see posts from you. Then some time ago, maybe a few months ago, you shared a post that started with “Do you need a data science manager or a data science expert?” Can you tell us more about this post? (4:58)

Barbara: I remember this post very well. This post was about the differences between data science managers and data science experts. These two positions are often confused with one another, even in the data science world. Often, people forget many aspects of managers’ tasks. I often get job offers from various headhunters on LinkedIn, and the job titles are “data science manager OR data science team leader”. But when I analyzed the requirements, I could see that 80% of them were very technical. For example, Python programming, machine learning solutions, deployment, Kubernetes, Docker, advanced libraries, such as TensorFlow, PyTorch, etc. (5:36)

Barbara: Only 20% of the requirements had anything to do with soft skills that are typical for managers, such as team management, stakeholder expectation management, communication, building relations and strategies for the company. Very often people confuse a data science manager with data science experts. I know that data science is a young field and people don't know what the requirements should be for the particular position, and they don't know what kind of person they need in their companies. So I want to spread awareness and save time in the process of building a data science team in organizations.

Alexey: Who do you think creates such job listings? Is it coming from management who might not necessarily yet know what they actually need from a manager? Who usually creates these posts? Do you have an idea? (7:13)

Barbara: I think that it’s usually created by some guys from IT – Head of IT, Head of BI, or somebody like this – and he or she doesn't know exactly how data science works. They don’t know what is more important for which position. Or sometimes HR creates the job offers and they don't have enough knowledge about these positions, “Data science. Okay, technical requirements. OK. Management – one point or maybe two points.” (7:28)

Technical and non-technical requirements for managers

Alexey: What do you think the balance should actually be? Say somebody is looking for a manager – should it be 80% technical, and 20% non-technical? Or should it be different? (8:11)

Barbara: It should definitely be different. I think maybe half and half, or something along those lines. But I think the most important are the managerial skills. (8:22)

Alexey: The next question I have in my list was, “What motivated you to write the post?” I guess you answered that – many headhunters reached out to you saying, “Hey, we have this amazing job as a data science manager!” You open this job, and then you see “Okay, this doesn't sound like a managerial role.” (8:36)

Barbara: Yeah, it happens. (8:59)

Alexey: Do you see such job descriptions very often these days? (9:01)

Barbara: Very often. I see this on LinkedIn and on different job websites. It happens very often. (9:06)

Alexey: But do you think the reason this happens is because these people want to find a manager and just don't know what requirements to put? Or do these people actually want to hire an expert and they don't need a manager? (9:18)

Barbara: I think it happens because people need or want to have a data science team and they don't know how it works. They think that “Okay, we need a manager – we need a team leader. Okay, this person is a data scientist with extremely strong knowledge of technical issues plus team management.” This is not a good way to do that. This would be something like Head of IT. A data science manager is something different from a senior web developer or expert in other programming areas. (9:31)

Importance of technical skills for managers

Alexey: As I remember from your background story, you were saying that you became a data science manager. When you became a data science manager, how much experience in data science did you have back then? (10:14)

Barbara: It depends on how you define ‘data science’. (10:37)

Alexey: I would define it as knowing all these things – like machine learning and these other technical things that you mentioned – do you think it's important for a manager to know these things? (10:39)

Barbara: Okay. So, before I was a manager, I worked for around five or six years. This is why I asked what data science means to you, because I worked for 3 years as a GIS specialist and it is a very specific part of analytics. But if you mean data science in general, not only machine learning analysis, but GIS as well – I worked with databases and analyzed some data, prepared visualizations, etc. So I think before I became a leader, I worked something like five or six years with data in general. (10:51)

Alexey: Yeah. But I wanted to ask you – is it really important to get to know all these technical things that you mentioned, like Python, Docker, Kubernetes, machine learning – all these technical things? Is it really important for a manager to know all that in order to be able to be a good manager? (11:40)

Barbara: It is important to understand how it works but on a high level. Maybe I shouldn't say this, but I don't use Docker. But I know how it works and this is the most important thing. I can talk with my team and we can discuss it because I understand how it works. But I don't actually use it. If you have a technical background and you can understand how different parts or the technology works – that’s what’s most important. It is impossible to know everything. Managers need to have wide knowledge, but it will be quite shallow as a result. This is the difference. Experts need a very deep understanding – very deep knowledge. While managers need wide knowledge, it will be a little bit shallower compared to an expert. (12:02)

Responsibilities and skills of a manager

Alexey: Yeah, interesting. So experts need deep knowledge and managers need shallow knowledge. They just need to know a bit of everything and know what to ask for and then just to go to an expert when they need deeper knowledge. Let's talk a bit about data science managers. What is a data science manager? What are their responsibilities and which skills do they need to have? (13:04)

Barbara: Data science managers help organizations use data to make the best management decisions. Of course, he or she has to know what skill set and personality is needed for team members. Because a team means there are people. People are not robots. We need people to feel good together and work well together. Managers need to know who they need and who will fit into the organization and the data team. (13:29)

Barbara: They also direct the personal development of each team member. Data science managers also show the directions of data science team development and prepare strategies for using AI and machine learning technologies in the organization. That's why we follow trends in AI and need a wide knowledge of techniques in data engineering, data analysis, data visualization, model building and deployment, MLOps, etc.

Barbara: Managers have a holistic approach to data science projects. They don't only know and use technologies and models, but they also communicate with stakeholders and business users. They understand all the requirements of the machine learning models very well and how it will be used by business users. With this holistic approach, they should be able to identify new opportunities to leverage data science in various business areas.

Barbara: Of course, like every manager, they should be able to manage problems, for example with team members, business expectations, model results, etc. For that, strong critical thinking and problem-solving capabilities are useful. This all combines soft and technical skills.

Importance of technical background for managers

Alexey: How hands-on is it? Does a manager actually sit and code? Do they open a Jupyter Notebook and tune XGBoost or not? Or not so much? (15:34)

Barbara: [laughs] Sometimes. It always depends on the structure and maturity of the organization and the data team. It's also connected with the level of technical skills and the background that the person has. It depends on whether he or she has trained models before and, if not, on his or her determination to learn how to do it right. Some managers don't have time to write code and train models. Others do, but spend a maximum of 50-60% of their time on it. But building models is not the core of our duties. It's kind of ironic that our tools are PowerPoint and Excel – we do a lot of presentations. (15:49)

Barbara: But in my opinion, a data science manager should be able to understand technical issues very well, analyze results from models, and give some tips on the direction in which the team could go. It's important to pose the right questions to the team and the business stakeholders. Because it's common for data scientists to enter rabbit holes and spend months on a model without any progress. A good manager can notice this and pull them out of the hole at the right moment and say, “This is good enough” or show them another direction. This is much easier if the manager has a good technical background.

Alexey: You mentioned if the manager has previously trained machine learning models. Do you think that it's actually a requirement that they have this experience of training a model? Or could they be good managers without this experience? (17:21)

Barbara: It is possible to be a good manager without this experience. It's harder, I think. But it's also possible. It also depends on the organization. I know an organization that is quite big and I know that their head of IT (or head of data science) is very non-technical. Of course, he did some courses and has some knowledge, but he doesn't have hands-on experience. It happens, but it is possible. It depends, like always. (17:34)

Alexey: When you were saying what kind of things managers do – it would be thinking of the team’s development, thinking of each team member’s personal development, preparing strategies, having a holistic approach and communicating with stakeholders, identifying opportunities, managing expectations, and critical thinking. None of these seem to require writing actual code, right? So people can probably be good managers without necessarily having a deep technical background in the past, right? (18:19)

Barbara: Right. But I think it's harder to develop a team and say which direction they should go, or ask for something when you don't have this kind of background. Because when you discuss things with your team, maybe you have some problems with models, for example – and you don't understand how it works. It's definitely harder. (19:04)

Getting involved in business development and sales

Alexey: Yeah. I already see a question related to the responsibilities of data managers. “Do you think data managers should be involved with business development and sales of projects or not?” (19:26)

Barbara: I think yes, because a manager is responsible for the strategy and how the organization will prepare different machine learning solutions and who these solutions are for, as well as in which business areas they will be applied. I think yes. I think that's why managers have more time to do something other than coding, while experts don't have enough time. (19:40)

Alexey: Yeah, because if you have all these responsibilities – even if you want to code, you just simply don't have time. (20:18)

Barbara: [laughs] Yeah. I know this very, very well and have had this problem many, many times. (20:25)

Developing the team

Alexey: Another related question to what we discussed, “How do you make sure that your team is learning new data skills, trying new tools, and seeing new applications every day? How do you do this as a manager?” (20:34)

Barbara: Okay. Well, my team is not too big, so maybe that's why it's easier. Because if you have a big team of 20-30 people or something like that, it's harder. My team is quite small, so I sit with them every week and we discuss these things. I know what they are doing on the product. Sometimes I see the code, because we have a repository. We discuss what is good and what's not. Sometimes I ask for help from a software engineer, because he has better programming skills and can advise me. After that, I can know how well they prepared the code and I see the results of the model, of course. (20:51)

Barbara: I think that it's important to tell each member which direction he or she should take and tell them what the goals for the year or for the month are. For example, if we have some courses on the website, he or she should take two or three courses or something like that. Then I see if they did it or not. I also see the result that it has on the project and how well it’s working out. Because taking the course is one thing and working on a commercial project in real life is another. I look at the results and I discuss them with other team members and see whether they are good or not.

Alexey: If I can summarize – correct me if I'm wrong – with each team member, you individually work on setting goals and choosing the right directions for personal development. After that, you pick a couple of courses that should help with this particular goal. Then the person takes the course and tries to apply what they learned in their job, right? (22:43)

Barbara: Yeah, it could be a course or sometimes, if I think that it will be better for somebody to learn from other team members… (23:11)

Alexey: You pair them up, right? (23:22)

Barbara: Yeah. I put them in pairs and I know that the senior will help the regular professional. After that I discuss the progress with them both. (23:23)

Checking team’s work

Alexey: Yeah, the other question is “Do you double check the work of your team?” I think you mentioned that you take a look at the code. But the reason you do this is because you want to suggest the areas for improvement, right? It's not like you're double-checking. (23:36)

Barbara: I'm not double-checking. They have code reviews and others have different things. Yeah, I look sometimes. But I do this maybe once per four models or something like that – when I'm very curious. But I don't double-check. They're better programmers than me. [laughs] (23:54)

Alexey: Do you think managers should do this or not? Should they be technically involved in that? Should they review pull requests? If they want, they can do this? Right? (24:15)

Barbara: Yeah. If they want, they can do this, but it’s not obligatory. I also do this sometimes for my own development. I'm curious how they code and what is in the project, but this is not something I double-check. It’s like “Oh, okay. We have done another function. Yeah. Oh, this code is better. Nice option.” (24:25)

Alexey: So it's more for your own personal development. (24:51)

Barbara: Yeah. I know that the project is going very well. It’s usually like “We’ll have this function in a few months.” “Oh. Nice.” (24:54)

Data science expert

Alexey: So instead of spending time reading some articles, you just go and check the code. “Oh, nice.” And grab some coffee. [laughs] Okay. We talked about the manager and we talked about their responsibilities. From the list of responsibilities, it doesn't seem super technical – there are many non-technical things. Now let's talk about the data science expert role. So, what is this role? Also when I hear the word ‘expert’ I think – are we talking about the role that comes after senior? We have this usual progression of junior professional, middle data scientist, senior, and then there is a role after the senior. So is the expert somebody who is after the senior role? Or could it be a senior just as well? (25:02)

Barbara: I could say generally, yes – experts are after seniors. Of course, the expert role is different from the manager because he or she needs a very deep understanding of algorithms and technologies from the area in which he or she has many years of experience. As a result, this person solves many problems in this particular field. Very often, an expert doesn’t only have a good knowledge of technology, but also domain knowledge. For example, if somebody prepares sales forecasting, he or she knows not only time series, Prophet or GBM, but also how the company sells and what could influence this process. Or if somebody works with NLP, he or she knows how the languages work. Or if an expert has been preparing a model for logistics for 10 years, he or she has not only a deep understanding of algorithms, but also domain knowledge. Thus, this person combines these two things very well. That's why I think that an expert is very often after a senior. Not always, but very often. (25:59)

Alexey: So experts know one particular business domain quite well. They know what kind of algorithms work there, what kind of feature engineering usually works there, how to prepare the model, how to train the model and how to actually apply this model – how to deploy it. So they know all this for this particular domain, very, very, very well. Right? (27:25)

Barbara: Yeah, exactly. Exactly. That's what I explained. [laughs] (27:50)

Alexey: Yeah, right. [laughs] That sounds very different from the manager's description, right? The only kind of non-technical thing was their domain knowledge. But I think if we dig deeper, it's actually quite technical. Because it's about applying machine learning – applying data skills – to a particular domain. So it's about using their technical skills to solve business problems. I think it's still quite technical. Do you agree? (27:54)

Barbara: Yeah. Experts are very focused on the technical side of data science. That's why it's different from a manager. It’s a different side. (28:21)

Hiring experts

Alexey: Yeah. So, why might we need an expert? Isn't it enough to just hire somebody who is a senior or regular data scientist? (28:38)

Barbara: Okay. So, some machine learning models are easier and some tougher. Some are very complicated and you need somebody who can handle them – somebody who has good experience. This person could propose complicated and good solutions, spend hours optimizing some hyperparameters, etc. In another case, the expert could be a freelancer. You work with him or her from time to time. This person advises you or your team in which direction to go with a particular problem. An expert knows the answer to many questions from this domain. Or very often, an expert is a senior, but with many years of experience, and this person knows the organization, its problems, technology, and business processes very well. Therefore, he or she already knows what you need and can prepare analyses or models very quickly. So this is why you sometimes need an expert. (28:48)

Who should we hire first?

Alexey: Let’s come back to our previous discussion. Say a company wants to hire a data scientist and they think, “Okay, we need to hire a manager.” Then they come up with this list of requirements, 80% of which are technical. What do you think – if they want to start a team? First of all, do they need to hire a manager? Or do they need to hire an expert? What kind of skills do they need to have for this particular purpose? (30:04)

Barbara: Okay. If you are a big organization – because if you are a startup it’s different – but if you are a big organization, I think you need two people. A manager and an expert or a good CEO. If you’re lucky, you can find an expert with very, very good soft skills. The expert may have 20 years of experience and knows the technology, but he or she could also meet with business stakeholders, etc. But generally, I could say that the best way is to find a data science manager who is a senior or professional and the other member of the data science team would be very technical. I think this is the best way for big organizations. But I know that very often, small organizations like startups do it this way – they look for a senior with good communication and good soft skills. This person is usually responsible for the whole data science staff in such organizations. (30:37)

Alexey: Let's say somebody agrees with you and hires a data science manager. Then the manager would basically build a data science team. So what should be in the job description for this position? (31:56)

Barbara: I think that this person should be prepared for team building. So experience and knowledge of this field is necessary. Then they also need to have wide knowledge about AI and machine learning solutions, not exactly in a technical way, but rather in terms of which business areas they could use them in. Also, of course, they need communication skills, writing skills – since we don’t always talk, sometimes we write emails, which is a different skill. And of course, presentation skills – how to make a good presentation about everything: results for models, but also strategy, business requirements, etc. They also need experience with management: stakeholder management, expectation management. And, of course, analytical thinking, because they should know what kind of analysis they should do and what meaning the model results have. It would also be nice if this person has a technical background or technical skills. But it's more important that this person will understand how AI or machine learning works, and not necessarily how to write the code. (32:15)

Can an expert build a team?

Alexey: Let's say some organization comes up with a job description, similar to what you saw. You’ve already seen quite a few of them. They managed to hire somebody who is an expert. But let’s say that they actually need a manager, but they hire an expert with this job description, because the job description matches the expert profile more than a manager. So the expert starts working. What can go wrong here? Is it enough just to hire somebody who knows it well and let them build a team? (34:04)

Barbara: Okay, so I see two red flags with hiring just an expert. One is team building. This is a totally different job – or task – than technical tasks. If you're working with people, you should understand how they work. You need empathy. You need communication skills to talk with others and motivate your team. You should also be able to build a team. But team, not in the sense of “people who just work together”, but rather “people who cooperate together and they want to work together.” Often, this is not easy to accomplish for experts, because they’re used to working alone or maybe with one or two people. So this is very often difficult for them. (34:42)

Barbara: The second red flag is the ability to find business areas where you could apply machine learning solutions. They may have good knowledge about algorithms, techniques, etc., but I'm not sure that they will find the best business areas. Therefore they could tell the stakeholders that this is the best solution, and that with this solution, the business will work better. They prepare a very good model – very good AI solution – but maybe in the end, the solution is not appropriate for this business case or maybe the company needs something different. Say they build a chatbot. Chatbots are great, but the organization needed something more, or something different. Chatbots are very cool and nice, but they actually need sales forecasting, for example.

Alexey: Basically, there should be somebody who can translate the actual business needs into a machine learning problem, right? An expert is not necessarily the right person to do that, because they might not be able to do this. Maybe they don't see the big picture or they don't know what exactly the company needs. So they just think “Ok. I’ll just go and do this chatbot and it will solve the problems.” But it doesn't always work out. Is that right? (36:54)

Barbara: Yeah. There is also one additional problem. Managers care about relations. We talk with different people from different departments. Generally, an IT solution does not only concern the technology department, but also the business department, and many other departments. Our job is also to talk with these people, care about relations, and do well to understand what is most important for them. Sometimes this is very tough for experts because their focus is on the technical side. This is okay, but that's why we need another person to fill in those gaps. (37:29)

Data science managers in startups

Alexey: I see. You also said that if it's a big corporation, they can actually hire two people – they can hire a data science manager when they start building a team and they can hire an expert as well. So they pair them together, let them do their things. But what if it's a startup? Maybe they don't have the luxury – or the money – to hire two people. (38:12)

Barbara: Yeah. Startups don't have this luxury. Very often, they don't even have the luxury to hire a data scientist and a data engineer. That's why startups look for data scientists with many skills. That's why, very often, data scientists can turn into real unicorns – because they need to have data engineering skills, data science skills, soft skills to communicate and do presentations, etc. And I think the startups should look for somebody who is focused on the startup’s domain, someone with good communication skills, and who has quite good knowledge of the technical side. Maybe not an expert level – not somebody who knows every detail of algorithms and technology – but someone who is on a level that is good enough, so that the first models will be good enough. This is maybe the best solution for startups. (38:37)

Alexey: Okay. Yeah. Interesting. Startups just need somebody who can wear many hats and is comfortable doing this. (39:49)

Barbara: Yeah, exactly. [laughs] (39:58)

Alexey: Not everyone would be interested in doing data engineering, building the actual model, and then going and selling this to business stakeholders saying, “Hey, this is my awesome model. Now you have to use it.” They’ll probably get an answer like “Um. How about no?” Right? [laughs] So yeah, that’s interesting. (39:59)

Barbara: Yeah. But in startups, you also have smaller groups. You don't have so many different departments. So you don't need to show your models or show your results to 100 different people. In a startup, you have 20? 15? Depends on the startup. (40:21)

Alexey: You show it to the CEO or CTO and say “Hey, can we implement this?” Yeah, that probably simplifies things. (40:36)

Barbara: Yeah. (40:46)

Project management

Alexey: We have quite a few questions. Most of them, I think, are related to the manager role. The first one concerns prioritization and deadline estimations, which are usually very hard for data science projects. I think you can relate to that. Is it your job to figure out how to prioritize things and how to set deadlines? What is the process for doing this? (40:47)

Barbara: Okay. For me, it's sometimes a little bit different, because I work in a company that prepares solutions for other companies. This is different from working with internal projects. But in my job, I am responsible for prioritizing tasks. When we have a lot of projects, for each project, I choose one person who is responsible for that particular project. But I always look from time to time whether the project is going okay or not. Of course, they can always ask me what they should do and what’s more important to do and we decide – or we’ll ask the clients because it’s always their decision in the end. (41:14)

Barbara: Regarding how to estimate the project, this is difficult. But I have one method for that. I don't estimate time for each task, because it's very, very difficult. Sometimes you think you have five tests, but after that, it turns out to be 20 tests. It's difficult to say, but I know how much time we have and how many people we have to do this task. I also know approximately how much time they generally need to prepare an entire model or to do tests, etc. I don't say, “Okay, for this model we need two weeks. We need two weeks for tests – so that’s one month total.” No. Instead I say “Okay – we need one month plus some buffer time. So that’s maybe six weeks. Two-three people. Okay. This is our timeline.” That’s an example. But it's extremely hard and I don't say how much time is needed for every task.
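The estimation approach Barbara describes, sizing the whole project and adding a buffer instead of summing per-task estimates, can be sketched in a few lines of Python. This is only an illustration: the function name `estimate_project_weeks` and the 50% buffer ratio are assumptions chosen to reproduce her "one month plus some buffer, so maybe six weeks" example, not something stated in the conversation.

```python
import math


def estimate_project_weeks(base_weeks: float, buffer_ratio: float = 0.5) -> int:
    """Estimate a whole project as base duration plus a buffer, rounded up to full weeks.

    base_weeks: rough duration of the whole project (not a sum of per-task estimates).
    buffer_ratio: assumed share of extra time for surprises (hypothetical default).
    """
    if base_weeks <= 0:
        raise ValueError("base_weeks must be positive")
    if buffer_ratio < 0:
        raise ValueError("buffer_ratio must not be negative")
    return math.ceil(base_weeks * (1 + buffer_ratio))


# Example from the conversation: about one month of work, padded to about six weeks.
timeline_weeks = estimate_project_weeks(base_weeks=4, buffer_ratio=0.5)
print(f"Timeline: {timeline_weeks} weeks for a team of two or three people")
```

The buffer is applied to the project as a whole, which matches the point that individual tasks (five tests turning into 20) are too unpredictable to estimate one by one.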

Alexey: Basically, your role also involves a bit of project management, right? You need to structure the project plan, and also monitor how the project develops, whether there are any bottlenecks and how to resolve these bottlenecks. Is that right? (43:36)

Barbara: Yes. A little bit, yes. Of course, we work with our project managers and they’re still learning how data science works, because it’s totally different from something like web development or other IT areas. That's why I sometimes help them. I also help them because all data science projects are my responsibility. In the end, when we have some problem, of course we need the project manager. But this is also my problem sometimes. That's why I help them out sometimes. (43:51)

Alexey: So the project managers can sometimes come to you and ask, “Hey, for this project, how much time do you think you need?” You need to sit down with them and figure out how much time to allocate to this chunk. Right? (44:33)

Barbara: Yeah. Also to figure out how many people we have to allocate. Of course, sometimes they ask me, “Okay, Basha – is it possible that this task could be prepared in two weeks?” Because they don't know if this is enough. Is it too long? Is it too short? I could say, “Yeah. Okay. With this guy or this guy, this is enough.” Or maybe “It's too short. We need more time. We need a buffer.” Or something along those lines. (44:46)

Alexey: Okay. Basically, when a client comes to you and they have some sort of ML/AI problem, you get together with the project manager and maybe the client themselves, and you try to plan this whole thing, right? This would be your responsibility as well – to also sit with the client and with the project manager to figure this out. (45:18)

Barbara: Yeah, exactly. Of course, the project managers are responsible for the project, but they need some knowledge from me, because they are not specialists in data science. They don't know which exact tasks will be needed in this particular solution. (45:39)

Ensuring that projects provide value

Alexey: How do you make sure that the team is improving the revenue of the company? Meaning that the projects that you're making are actually impactful and help your company and perhaps the clients to deliver value? (46:14)

Barbara: Hmm. How do I make sure? I ask many people. I talk with people. Of course, we have some metrics, but these are math metrics – metrics of models, or something like that. But I talk with clients and I ask them. I also ask my project managers, because they have other perspectives on the project. They talk with clients, they talk with our sales department, and they're not super technical guys. They see other things, which is why I also discuss it with them. (46:29)

Barbara: I think there are two ways that I can make sure that I’m moving in a good direction. I ask the clients – it doesn't matter if these are external or internal clients, because you are preparing a solution for someone. I ask them. Of course, we can prepare some dashboards. If you prepare sales forecasting, you can monitor how good it is and if it improves the service or not. But this is not easy in business. It’s not black and white, necessarily. Is it the result of the model or is it the result of other stuff?

Alexey: It's actually only a part of the question. [laughs] There’s more. It's more like a comment than a question, but it says “Most data science applications take time to provide value.” It takes time – you launch a project – maybe it fails, maybe you need more iterations before it actually starts showing impact. That's why they are costly. How do you make sure that the team is working on things that potentially, maybe in half a year, will be impactful, but right now, maybe not so much? (48:08)

Barbara: Yeah. That's why we need monitoring. Always. In each data science project, we need monitoring. We need monitoring of models and of the business processes. For example, with sales forecasting, you monitor “Are the sales better or not?” But we should remember that not everything depends on the models. Different AI solutions could do something, but we also have many different business processes, which may influence this process. So we should take a more critical approach to this analysis. It’s not that “Oh, my model works very well because the sales are better.” Yeah, well, maybe the sales are better because it’s December and Christmas time and we sell toys, for example. But that's why it's very important to have monitoring of this process. (48:43)

Alexey: Yeah, so you need to know that the improvement you see is because of the model, and not because your sales department just started to work better. Right? (49:47)

Barbara: Exactly. That's why it's quite tricky to analyze results. (50:04)

Questions before starting a project

Alexey: We have another question that’s quite interesting. “Do you have a standard set of common questions that your data scientists use to interview the clients?” (50:12)

Barbara: Questions for the clients or during an interview? (50:24)

Alexey: For the clients. (50:26)

Barbara: Yeah, I have some basic questions. I always ask what kind of problem they have in the business. Some clients come to me and say, “Oh, we need machine learning, or we need AI, because we heard that machine learning will do some magic stuff.” And I always say, “Okay, but what kind of problem do you have? Because maybe you don't need machine learning – maybe you need only a moving average? Or maybe you need a good dashboard?” So this is the first question. Another question is always, “Do you have any solutions for this problem right now?” Because maybe now they have sales forecasting and they have the moving average. Very simple. Okay. I ask about this because I have to know if they have some baseline or not. That is the second question. (50:29)

Barbara: Other questions are usually technical, such as “Which technologies do you want to use for the solution? What kind of data do you have? How much data do you have? How do you work with this problem right now?” Maybe they have some experts and they analyze everything in Excel. Or maybe they have some expert knowledge – some wisdom or something like that – and this is not in data. This is not in Word, Excel or in other systems, but it's only in the knowledge of these experts. So that's why I also ask about this.

Barbara: I ask a lot about their data and what they know about the influence of this process on the problem. Let’s use sales forecasting as the best example, “What do you know about your sales? How does it work now? Do you sell this every day or every month? Do you have some peaks in sales or not?” So, I ask how it generally works. This can be quite a long process of questioning.

Alexey: Or maybe “Do they have any KPIs already to see how the process works?” (52:56)

Barbara: Yeah. Sometimes I also ask, “Do you expect some specific KPIs or metrics?” Because sometimes they could say, “Okay, we need uplift across the position of 70-80% because this is the best solution for us.” Or something like that. I ask a lot about data – how much they have, how it works, if it’s clean or not. What do they know about the data? Yeah, I think those are the most important questions for the beginning. (53:00)

Alexey: The first question, “What kind of problems do they have?” As you said, maybe they don't even need machine learning. How often does this happen – that they actually don't need ML for that? (53:43)

Barbara: For me it’s not a common situation. But the more common problem is that people don't have good data – or not enough data. This is more problematic. Maybe my customers are quite aware and they know that they need machine learning for their particular problem, because right now they use something simple. So it doesn’t happen very often, but it could be a possibility. (53:57)
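
As an illustration of the "moving average as a baseline" idea Barbara mentions, here is a minimal Python sketch. It is not her team's method. The function names, the window of three periods, and the sales numbers are all made up for this example. It forecasts each period as the mean of the previous periods and measures the mean absolute error, which gives a number that any machine learning model would have to beat.

```python
from statistics import mean


def moving_average_forecast(history, window=3):
    """Forecast the next value as the mean of the last `window` observations."""
    if window <= 0:
        raise ValueError("window must be positive")
    if len(history) < window:
        raise ValueError("need at least `window` observations")
    return mean(history[-window:])


def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    return mean(abs(a - p) for a, p in zip(actual, predicted))


def backtest_baseline(series, window=3):
    """One-step-ahead backtest: forecast each point only from the points before it."""
    actual = series[window:]
    predicted = [
        moving_average_forecast(series[:i], window)
        for i in range(window, len(series))
    ]
    return mean_absolute_error(actual, predicted)


# Hypothetical monthly sales, invented for illustration only.
monthly_sales = [100, 110, 120, 130, 140, 150]
baseline_mae = backtest_baseline(monthly_sales, window=3)
print(f"Moving-average baseline MAE: {baseline_mae}")
```

If a proposed model cannot clearly beat this baseline error on the client's own data, that is a hint that a simple average or a dashboard may be enough.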

Women in data science

Alexey: Yeah, thanks. Maybe this one is also a tough question. “Do you have any thoughts on why females are still a minority in data science teams?” (54:31)

Barbara: Okay, so from a totally different part. [laughs] Yeah. For me this is important because I also help in an organization – Women in Machine Learning and Data Science. (54:43)

Alexey: Yeah. That's why I thought you would like this question. (54:55)

Barbara: Yeah. I like this question. But maybe it’s not so connected with managers and experts. (54:58)

Alexey: But maybe it is actually connected. A data science manager actually manages teams, and how can you, as a manager, do something about that? (55:06)

Barbara: Ah, it doesn’t exactly depend only on me. About two weeks ago, we had a discussion panel in Poland. We organized this with the organization and with other companies. We discussed what data science looks like from a woman’s perspective. The problem is that many women are scared about the job interview. That's why sometimes I don't have a choice in the matter. But I think that more and more women want to be data scientists and data analysts – and generally work with data. (55:17)

Barbara: It's a good situation that I see that many girls, many women, like working with computers and like to work with technical things. They don’t have such a big fear about doing it. All we can do is talk about it and we can spread the word. We can spread awareness that it's not so hard. In general, it’s a hard job – but it doesn’t matter if you are a girl or if you're a man. It depends on your personality and your skills. Everybody has some advantages and disadvantages. It doesn't depend on the gender.

Alexey: Yeah. You mentioned that women are scared of job interviews. Maybe you, as a manager, can give them some advice? How can they be less scared, if it's possible? (57:11)

Barbara: How to be less scared? For me, it was always very helpful that I talked with other people – my friends, or my family – and they encouraged me and said, “Okay, everything is good. Don't worry. This is not the last chance. Don't be afraid. If not this job, there will be another one.” At this moment, the level of stress becomes lower. Another thing – everybody, especially women, should be prepared for different questions. They should tell themselves “Okay, I can do it. Maybe I don’t have all the requirements. Not 100%. But 70-80% of the requirements are good enough.” I say this because I see that men don't have problems with this and may have only 70% of the requirements for the job. But for women, this is a problem. I think this is the most important thing. “Okay, maybe I'm not a 100% match for the job, but I match 70%. Okay, I could try. Maybe not this job. Maybe the next one.” Don't give up. (57:26)

Finding Barbara online

Alexey: Yeah, thanks. How can people find you? (59:03)

Barbara: I think the best solution is LinkedIn - Barbara Sobkowiak. I think that’s the social media that I use the most. (59:07)

General advice

Alexey: Do you have any advice for anyone who is listening or watching this? (59:20)

Barbara: Any advice? Okay. So if you are here and you're looking for a direction to go in your career, you should ask yourself, “What gives you the most satisfaction?” If you have a problem answering this, you should ask your friends or your colleagues, and you should find a mentor. I think it's easier if you find a mentor, and talk with him or her and ask him or her about everything. So talking with people and networking. (59:33)

Wrapping up

Alexey: Yeah, thanks. I hope that the questions didn't catch you off guard. They're fun. I think you enjoyed them as well. (1:00:24)

Barbara: Yeah. (1:00:30)

Alexey: Thanks a lot. Thanks for joining us. Thanks for sharing the experience. Thanks for sharing your stories with us. And thanks, everyone, for asking questions. We got quite a few questions. There are still six questions that we didn't cover. So, apologies for that. I think that's it. We should be wrapping up. Thanks, Barbara. (1:00:33)

Barbara: And thank you. Thank you for the invitation, very much. (1:00:56)

Alexey: Enjoy the rest of your day and have a great weekend. Good bye. (1:00:59)

Barbara: You too. Bye! (1:01:02)
