Building Business Acumen for Data Professionals

Transcript

The transcripts are edited for clarity, sometimes with AI. If you notice any incorrect information, let us know.

Thom’s background
Importance of business acumen
Don’t try to be the domain expert — be a data scientist
Deliver crap as fast as possible
Tracer bullet
Improving ETLs
Data governance and data collection
Data-driven vs Data-informed
Invest in analytical skills
Machine Learning development pipeline
Scaling features and feature engineering
PCA and collinearity
The most important business skills
Networking: lunch and beer
Lunch and beer during COVID
Integrated machine learning and AI community
Wrapping up

Alexey: This week, we'll talk about business acumen. We have a special guest today, Thom. Thom is a data scientist with a PhD in mechanical engineering, a multiphysics engineer and a Python-loving geek who lives in the United States. That's probably not a very long bio, but Thom knows a lot of things. I watched a lot of different events with Thom and I know that Thom can talk about pretty much everything – any data-related topic, Thom knows something about it. It's a great pleasure to welcome you, Thom, today on this show. Hi. (1:15)

Thom: Hi, Alexey. I'm really glad to be here. I want to qualify something you said. I know a little bit about a lot of things, but I don't know a lot about a few things. Or maybe I know a lot about a few things – I hope so. But one thing that helps a lot, and that I always like to emphasize, is that you can learn a lot more if you focus on concepts rather than details. (1:50)

Thom’s background

Alexey: Actually, I wanted to invite you to this event for a long time. But I was struggling to think what we could discuss. Usually these talks have a topic – it’s a themed discussion. The discussion can be about something such as business acumen, as is our topic today. But I knew that you could talk about anything. We talked together about full-stack data scientists, and I saw you talking with Danny about pretty much everything. Whatever the audience wants to know, you were there being able to answer questions about it. I also know you host events talking about transformers and other very theoretical things. Then there are things that you share on LinkedIn – whether it’s SQL or machine learning comparability theory – basically a whole range of different things. But I finally found a topic – business acumen. (2:18)

Alexey: Before we go into this main topic of business acumen, I wanted to ask you to tell us in a few words about your background. Can you tell us about your career journey so far?

Thom: Absolutely. I was born and raised in Dallas, Texas. I had the honor of being senior class president of my very large Dallas high school. I was an awkward jock growing up – a competitive swimmer. Then I went off to the big bad University of Texas at Austin to get a Bachelor's degree in mechanical engineering. Soon after that, I went into my nation's naval nuclear program and I went through the same training that a naval ensign would go through, but I was working for one of the contractors that serve the naval nuclear program. When I arrived at the prototype training on real naval reactors – they were land-based back then – I went through the same training with the naval officers. But when we completed it, they went out to sea, whereas I stayed at the plant and helped with the operations, maintenance, and training of the new naval personnel coming through. I wasn't happy enough with that, so I went to grad school. (3:21)

Thom: For a while, I managed the research reactor at Texas A&M University, and then got my Master’s and PhD in mechanical engineering there. My Master’s focused on robotics and further research, while my PhD focused on design and modeling of hybrid electric vehicle power plants. That's when I really started to dive into data science-type topics. We didn't call it data science back then. I was learning neural networks… What's that?

Alexey: What did you call it back then? (5:09)

Thom: Just whatever sub-topic we were learning. I first realized, “Oh, there are limits to what we can do with classical physics-based modeling and control systems design.” One of my controls professors was teaching fuzzy logic and expert systems – I loved that course – then I started studying neural networks. There was no Python back then, so we were coding this up from scratch, in C or whatever, and doing our own memory allocation. We did some pretty cool things back then with that, and I ended up needing some AI in the modeling that I did for my PhD. Then I moved to the Boise area, way back in ’97. (5:11)

Thom: I started out as manager of advanced products for a small company that made automated wet-action cleaning tools for the semiconductor industry. But I felt like the leadership was going to bury the company, so I moved over to HP. I was there for almost 17 years. Then I moved over to ON Semiconductor. Over the course of that, I mentored a lot of younger people, either working on their advanced degrees or needing help with modeling things. Of course, we were very test-centric in LaserJets, but I did a lot of other things there too. At ON Semiconductor, that was really like a factory of data science-type work – algorithm development, primarily – but I was able to try out and achieve some really cool machine learning there. That was a combination of unsupervised, but where the model could figure out its own labels on the fly.

Thom: Then, most recently, I was at UL as a lead data scientist for their Prospector SaaS. Prospector serves up a database through their SaaS – it's the world's largest database for plastics, paints, coatings, personal care and cosmetics. I was developing an AI to process unstructured data for them. Now, I just took some time off to look for a new job because I realized I needed to move on. So that's where I'm at.

Thom: Probably the thing I'm most proud of, Alexey, is that about a couple years ago, young people on LinkedIn started reaching out to me for help. To make a long story short, my help (shocking to me) was considered very helpful. My followers grew and I ended up being overwhelmed by one-on-one mentoring requests. So, now we do a thing we call ‘integrated mentoring’. It was originally the name of my blog, ‘integrated machine learning and AI’, but now it's a community. We're approaching 1000 people in our Slack workgroup. We have Saturday morning integrated mentoring, me and one of the other gentlemen who's now my best friend in the world, Ghaith Sankari – we're writing a book together. We're also teaching a course for free to test out our book material. I also have a couple of Tech Time chats each week. So we're still figuring ourselves out and growing. But we have a motto, called “more together”. We just all want to grow and learn and we're helping each other do that. That's our spirit.

Alexey: Yeah, that's very cool. I'll make sure to add any links that you will send me about the course you're doing. You mentioned a book, you mentioned the community, LinkedIn. I'll make sure to include all that. (8:52)

Importance of business acumen

Alexey: Coming back to our main topic – business acumen. Before this meeting, I looked up what ‘acumen’ means and the definition I got was “the ability to make good judgments and make quick decisions”. So, I wanted to ask you, “What is business acumen? Why is it important for data professionals – for data scientists and others?” (9:12)

Thom: I want to tell the audience that what I'm about to share is stuff that I learned the hard way. Quite humbling, I should add. When it came to leading a group of engineers in design, and applying design methodology, I felt pretty skilled at that. But when it came to really designing a product for the market, I realized I had a lot to learn. The first thing that helped me a lot was discovering that, after I'd learned some great design methodology, and I'd learned the Six Thinking Hats for managing egos and debates that are prevalent in my culture, I came across a gentleman named Steve Blank out of Stanford. He was a professor there at the time. Now he's a consultant; I think he's retired from being a professor at Stanford. But he really got my attention with the spirit of customer-centric design. He kept it at a very high level to get the spirit of it across. I would say it's this. (9:36)

Don’t try to be the domain expert — be a data scientist

Thom: By the way, I'm speaking to business acumen in the sense of “How would a data scientist fit into exercising good business acumen?” Well, the first thing I would say is, “Don't burden yourself with being a domain expert, or a subject matter expert – or having exceptional business acumen at the company or organization you're serving.” Why do I say that? I'm not saying don't try to get good at it. But if you go in and tell a leader of the business what they need to hear from you, they're going to look at you like, “I run this business. I know what the concerns are.” So I encourage all data scientists – don't go in that way. Go in with “What are your biggest concerns? What are the things that cause you to be afraid? What are the things that keep you up at night? I want to know those things. Then what I want to tell you I'm going to do – I'm going to go look at our current data assets and I'm going to create a matrix. I'm going to look for the strongest intersection of ‘business need’ and ‘current data assets’. And if there's a business need that is really high that we don't have adequate data assets for, I will encourage collection of the data we need to answer questions for that. But I want to get started right away!” Then it's a spirit. Now let's bring Steve Blank into the picture. (10:51)

Alexey: Sorry for interrupting you, but the advice you just mentioned is a bit controversial. You're saying “Do not focus on being the best domain expert, because this is not your job as a data scientist. Instead, ask your management, ‘What is important?’” Do I get that right? (12:25)

Thom: Exactly. Because, how can you be a really good data scientist and a domain expert? In fact, when I'm in an interview where someone seemingly wants me to show that I'm a domain expert in their domain, I am almost tempted, Alexey, to say, “I think I'm in the wrong interview. I was interviewing for a data scientist position. The kind of questions you're asking me right now – I would create a framework and go fill it in by talking to the domain experts. Because how can I be a domain expert and a data scientist? I want to serve the domain experts and the business leaders.” And if that's controversial, I will continue to be politely controversial on these points. (12:46)

Alexey: It makes total sense, I think. (13:36)

Deliver crap as fast as possible

Thom: I think it's important to improve our business acumen in the domain that we're serving. But let me say this. You've heard the adage “80% of machine learning models don't make it into production”? Well, my first question, and I speak on this with my dear friend – we call each other brothers, Greg Coquillo, he's a technology specialist manager at Amazon – we make the point that, “Hey, let's take the focus off that. What we're really looking for is a return on data. What if, as we're looking for opportunities to develop machine learning models, we've gained so many insights along the way that now we have a good data story? That data story can help the business.” Well, we're going to deliver that before we worry about putting a machine learning model into production. So let's take the focus off of ‘how many machine learning models get into production’ and let's have the focus on ‘how much value are we deriving from our data?’ (13:39)

Thom: When we talk about that 80% that we do on data munging and preparation, before we get to the modeling phase in the development of the machine learning pipeline – I would say 80% of the insights that you can get back to the business come from that 80% of machine learning preparation. For example, I may not always use linear and logistic regression for my machine learning model that goes into production, but I will always look at it. Why? Because it gives me a great Pareto of feature importance. Well, wouldn't that be as important, maybe more than predictions from a model?
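The "Pareto of feature importance" idea above could be sketched roughly as follows. This is a minimal illustration, not Thom's actual workflow: the feature names and the synthetic data are hypothetical, and plain least squares on standardized features (via NumPy) stands in for whatever regression tooling is used in practice.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500

# Hypothetical features on very different raw scales (assumed for illustration).
names = ["annual_spend", "support_tickets", "discount_rate"]
X = np.column_stack([
    rng.normal(50_000, 15_000, n),
    rng.normal(5, 2, n),
    rng.normal(0.5, 0.1, n),
])

# Standardize so every feature has mean 0 and standard deviation 1.
Z = (X - X.mean(axis=0)) / X.std(axis=0)

# Synthetic target with known standardized effects: 3.0, -1.0 and 0.2.
y = 3.0 * Z[:, 0] - 1.0 * Z[:, 1] + 0.2 * Z[:, 2] + rng.normal(0, 0.5, n)

# Ordinary least squares with an intercept column.
A = np.column_stack([np.ones(n), Z])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
weights = coef[1:]

# Pareto view: sort by absolute weight and accumulate each feature's share.
order = np.argsort(-np.abs(weights))
share = np.abs(weights[order]) / np.abs(weights).sum()
pareto = [
    (names[i], float(weights[i]), float(cum))
    for i, cum in zip(order, np.cumsum(share))
]

for name, weight, cumulative in pareto:
    print(f"{name:16s} weight={weight:+.2f} cumulative share={cumulative:.0%}")
```

The sign of each weight is part of the story: a strong negative feature is as actionable as a strong positive one.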

Thom: I mean, yeah – you can react to predictions from a model. But you can proact and understand what features are most important. You think of the wisdom of the Gallup strength-finder poll – they're like, ‘exploit your strengths’. Well, if we know this feature’s most important business-wise, we can focus on the variables coming in for that feature. How can we get that even stronger? How can we get the variables more toward the positive end of that? Or maybe there's a negative feature that's really strong. That's important knowledge too. How can we reduce that so that we get the performance we want? But back to Steve Blank. As you're developing your pipeline, I think (and I suck at this, but I'm trying to get better at it) deliver crap. As fast as possible. Let me say it again. Deliver crap as fast as possible.

Alexey: Yeah, thank you. (16:37)

Tracer bullet

Thom: I'm leveraging a little bit from Dave and Andy of The Pragmatic Programmer. They talk about tracer bullets. Can I use bad language on the show? (16:38)

Alexey: Um... [laughs] (16:51)

Thom: Just a little bit. (16:52)

Alexey: A little bit, OK. [laughs] (16:53)

Thom: OK. I have a term, but I'll modify it a little bit. Y'all will know the word I would have used. I believe now in a ‘crappy tracer bullet’. That means I want an end-to-end solution. It may not have all the features I want it to have, but I tested it end-to-end. I'm just making sure the end-to-end works. Now I can build the thickness of that process and get all the features in it that I wanted. But even before you get that end-to-end, if you get some insight and you're doing customer-centric development – again, your customers can be a combination of your domain experts and your business leaders – have the spirit of getting frequent feedback during the course of your development. (16:56)
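A tracer-bullet pipeline of the kind described above might look something like this sketch. Everything in it is a hypothetical stand-in: the records, the stage names, and the "model" (just a mean) are assumptions chosen so that each stage exists and the whole thing runs end-to-end before any stage is made good.

```python
from statistics import mean


def extract():
    """Stage 1: stand-in for a real query; returns hypothetical raw records."""
    return [
        {"units": 10, "price": 2.0},
        {"units": None, "price": 3.0},
        {"units": 4, "price": 2.5},
    ]


def clean(rows):
    """Stage 2: crude cleaning, simply drop incomplete rows for now."""
    return [r for r in rows if all(v is not None for v in r.values())]


def fit_baseline(rows):
    """Stage 3: the 'model' is only the mean revenue per record."""
    return mean(r["units"] * r["price"] for r in rows)


def report(rows_in, rows_used, prediction):
    """Stage 4: a one-line data story to put in front of stakeholders."""
    return (
        f"{rows_used}/{rows_in} usable records; "
        f"baseline revenue per record = {prediction:.2f}"
    )


raw = extract()
usable = clean(raw)
baseline = fit_baseline(usable)
summary = report(len(raw), len(usable), baseline)
print(summary)
```

Each function can later be thickened independently (a real query, real cleaning rules, a real model) without changing how the stages connect, which is what makes early feedback cheap.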

Thom: Now, here's the other part of getting a better return on data. You're marching toward this, and you're helping the domain experts and the business leaders understand what you're doing and why. But if you're really going to get a good return on data, that collaboration needs to go both ways. You've been working to understand their needs – they need to understand why you're doing your processes and why they're important.

Thom: Again, we're not asking them to become data scientists or data evangelists themselves. We're saying, “Appreciate what we're doing. Appreciate why we're doing it.” Because then our collaboration will be optimized. It'll be much better, instead of you just saying “go off and do this” and then we try to do it. No. There's so much more value in the actual process – to me understanding what's worrying you and you understanding why I'm serving your questions, your needs, the way that I am.

Alexey: Let’s see if I can summarize what you said. You said “Deliver crap as fast as possible.” Then you show it to your end users – to the business – and then you get feedback. So you really want to use this crappy solution that you develop pretty fast to understand if this is serving their needs. You want to get feedback from your users. You want to get insights from your users. You learn a lot just by doing this. Right? So you don’t put a lot of time and effort into the development of your solution, but you're learning from what you get as a result – from the feedback. And then you iterate. Right? That's the essence? (18:52)

Improving ETLs

Thom: Yes, exactly. Let me give you some ‘for instances’. Okay. The first thing, when you're doing your ETL – extract, transform, load. I don't like that acronym, but… it's not a bad acronym, but just the spirit of “Yeah, we're getting the data.” Well, was that very smooth? If it's not really reliable and easy to do that, you need to go ask the organization, “Hey. Just some constructive feedback – it would make it better for our ETL work and my data science group if y'all could fix these things.” Now we're marching along. We start to visualize the data and we start, obviously, to discover some of the dirty data. (19:32)

Thom: A lot of data is collected by data management systems of some kind – where a programmer has made sure that data coming in will go into SQL safely and it provides an insulation of the database too. Well, let's say you're getting a lot of null values. Why? “Null values/missing values are not good for me. Can you make sure people have to enter a value for this field?” “Oh. We don't like to do that.” “Well, it's costing us money. And let me make the case.” I would say this is where we need to be the most adamant – strong-arm it. Just say, “No, this is not good. Data is our platinum. Data is our most valuable asset. You can't just willy-nilly allow people to not enter a field here.” “Well, can't you automate your missing…” “Yeah, I could. But it's not going to help our modeling. It's not going to help our data storytelling. Just make sure this field can’t not be filled.” I'm not saying you would normally get into an argument like that. I'm just saying be prepared to be very strong about clean data.
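Making the case about null values usually starts with numbers. A minimal sketch of a per-field null-rate audit is shown below; the records and field names are hypothetical, and plain Python is used only to keep the example self-contained.

```python
# Hypothetical records as they might come out of a data-entry system.
records = [
    {"customer_id": 1, "industry": "coatings", "region": "EU"},
    {"customer_id": 2, "industry": None, "region": "US"},
    {"customer_id": 3, "industry": None, "region": None},
    {"customer_id": 4, "industry": "plastics", "region": "US"},
]


def null_rates(rows):
    """Return {field: fraction of rows where the field is None or absent}."""
    fields = sorted({key for row in rows for key in row})
    return {
        field: sum(row.get(field) is None for row in rows) / len(rows)
        for field in fields
    }


rates = null_rates(records)

# Worst offenders first: this ordering is the start of the conversation.
worst_first = sorted(rates.items(), key=lambda kv: kv[1], reverse=True)
for field, rate in worst_first:
    print(f"{field:12s} {rate:.0%} missing")
```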

Data governance and data collection

Alexey: Do you do this as a data scientist? Or do you go to your manager and then your manager does that? As a usual data scientist, you don't always have the possibility to go to the users who are not entering the data and then make a case for them. (21:39)

Thom: It depends on the size of your business, your company. It depends on the culture you're in. But here's the spirit. Who does data governance belong to? I think it belongs to everyone. If people are data literate, and they understand the power of data – who is not going to care about clean data? People that don't care about clean data are people that don't know how important data is. How it affects the profits of the company. How it affects the decision-making abilities of the company. So, data governance is everyone's responsibility. A big part of that is making sure we have clean, complete data. Now we continue down the pipeline. Hopefully, we've instigated – or initiated – some better data collection practices. We might also realize, “This is data we don't have. We need to start collecting.” And so we get a collection effort going. And I agree with you, Alexey – we really want to kind of hand these things off to the manager. But if we are the manager, we want to lead those efforts, of course. (22:00)

Alexey: Sorry for interrupting you, I'm just trying to connect these null values to business acumen. For us, when we see “Okay, there is a null value.” We don't just say, “OK, whatever. Null is null.” Instead, we ask ourselves “Why is it null? How many null values are there?” We try to relate this 4-byte record in a database to something real. “What does it mean?” You try to understand that. Then it turns out that it’s just the people being lazy. They don't want to fill this field even though they are supposed to. Then you get this business understanding. Is that right? (23:11)

Thom: Yeah. But let's think of it this way. A lot of people are correctly shifting from saying ‘data-driven’ to ‘data-informed’. I kind of like the Tony Stark mentality “Too much to ask for both?” You know? I mean, market analysis causes us to be data-driven, right? But when we're talking about data science, more often – yeah, it's ‘data informed’. So, I would like businesses to think that you still need to be data-driven, but as far as your data scientists, that's probably more being ‘data-informed’. (23:52)

Data-driven vs Data-informed

Alexey: How would you define those terms? ‘Data-driven’ and ‘data-informed’? (24:29)

Thom: Those are two different philosophies. One is that your business should be ‘data-driven’. For example, “Are you doing market analysis to make sure you understand why you're taking the direction you're taking? Have you really done enough analysis to know what feature needs improvement right away?” Things like that. Whereas ‘data-informed’ is a separate realm of saying, “OK, the business leaders have the business expertise.” They may have a different group that's helping them understand what initiatives to take. But then at that point, the kind of work that data scientists and machine learning engineers do – data storytellers – is we're informing the business with data. We're giving them information that’s very refined, even to the level of the prediction. (24:33)

Thom: Back to your earlier point – how does making sure you don't have a missing value in a critical feature relate to business acumen? You’re now not being as well ‘data-informed’ if you have missing values for a critical feature. And that's my point. Really, this is a system. Business is a system and your data scientists are part of your feedback for that system. That's how you stay data-informed. Think about a feedback control system. If the data scientists, the data storytellers, the data evangelists, the data specialists, aren’t monitoring the data performance of the company, how do we have feedback to know where we need to improve the business? But also, where does the business go? I personally think it's a little premature to take the focus off of being data-driven in business, because – how are you going to know where to drive your business without data? It may be a different type of data than what your data scientists work on.

Thom: If you want to improve business acumen, that's not just a burden on the data scientists to understand the business. It's also a burden on the business leaders and domain experts to understand how data informs their decisions. It's a complete cycle. It's a complete system where this isn't just about us having better business acumen, but it’s also about the business leaders that are the business experts improving their business acumen, but with data. This has to happen in collaboration with their data evangelists.

Thom: By the way, I like to say data evangelist, because there are a lot of valuable data specialists out there that would say, “I'm not a data scientist.” But when you look at what they do – they're doing at least 80% of the work that data scientists would do, but they're doing it very thoroughly in a very explainable way. So I'm always encouraging people that come to me for mentoring, “Oh, you want to be a really good data scientist? Then become a really good data storyteller.” That's a great way to start. “Oh, you're afraid of the math right now? Get good at Power BI and processing data and telling stories with data.” That will make you a better machine learning engineer or a better data scientist.

Invest in analytical skills

Alexey: So basically, invest in analytical skills. “How can you make sense of this pile of data? How can you crunch it, summarize it, and how can you visualize it in a way that is digestible and understandable even for folks outside of the data organization – for business people?” Right? (28:09)

Thom: Exactly. Let me ask you a question to prove the point. Alexey, do you think we always need a machine learning model to give value to the organization about the data we're analyzing? (28:29)

Alexey: Yeah, well. Probably not. (28:44)

Thom: Sometimes good data storytelling will do the job. And that's something we can deliver very quickly, while we continue to work on a model to see what additional value it might give. (28:47)

Alexey: Yeah, and to your point about null values and then convincing people to actually fill these fields – you mentioned that you can build a case to show how important it is – instead of coming there and screaming, “Hey, you must fill this in! You're responsible for the quality of data!” Instead, you show that “If they don't fill it in, this is how much money you will lose.” Or “If you fill it in, this is how much uplift you're getting.” Right? (29:00)

Thom: Excellent point. (29:29)

Alexey: So this means being data-informed, right? You show a story – you show that “This is an important feature, but it's missing in 90% of the cases. If it wasn't missing at all, then this is how much money we will get.” Or “This is how much uplift in some important business metric we would get.” This is being data-informed, right? (29:30)

Thom: Exactly. Absolutely. But let me give you another ‘for instance’. Let's say you're having a polite discussion with your programmer that manages this data management system for controlling the way data goes into a database. I've seen cases where there was a null value – meaning a missing value – and when you really studied the set and you looked at more of the data, you realize, “Oh! It's missing because – it's just implied that when that's missing, it's because that item doesn't exist.” But I still have to fix that and it's still an uncertainty, “Did someone just forget to enter that or is it really because that doesn't exist for this record?” It still costs time for the data scientists where, if people care when they're entering the data to get it complete, it just saves everyone time and it makes the data more crisp and more clear. I don't think we'll ever get rid of the need to clean data. But if data literacy improves enough, at the level that you and I are talking about, we can at least reduce that data cleansing work to a more modest level. (29:54)
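One way to remove the "forgotten or genuinely absent?" ambiguity described above is to require an explicit marker at entry time. The sketch below is an assumption about how that could be done, not a description of any system mentioned in the conversation; the sentinel value and the field names are hypothetical.

```python
# Hypothetical sentinel meaning "this item does not exist for the record".
NOT_APPLICABLE = "N/A"


def validate_entry(entry, required=("coating_type",)):
    """Reject blank required fields, but accept an explicit N/A marker."""
    missing = [field for field in required if entry.get(field) in (None, "")]
    if missing:
        raise ValueError(f"required field(s) left blank: {', '.join(missing)}")
    return entry


# An explicit "does not exist" is accepted and stays unambiguous downstream.
ok = validate_entry({"part": "A-100", "coating_type": NOT_APPLICABLE})

# A blank field is rejected at entry time instead of being guessed at later.
try:
    validate_entry({"part": "A-101", "coating_type": None})
    rejected = False
except ValueError as err:
    rejected = True
    print(err)
```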

Alexey: I keep interrupting you. (31:17)

Thom: No, that’s okay. (31:20)

Machine Learning development pipeline

Alexey: We were talking about ETL. Maybe we can go back to that. We get the data, we see how smooth this process is, and we try to optimize it. Then we get the data, we visualize it, we find dirty data, and we try to fix it. What’s next? I imagine this was some sort of sequence and we stopped at visualizing. So what is going to be next? (31:21)

Thom: Yeah, I call what we're walking through right now ‘the machine learning development pipeline’. Obviously, once you've completed this development, you're putting a completed machine learning pipeline into production. The next stage after you've cleaned is to condition your features. Let's remember, all machine learning problems are math problems. Really, all data science work is math problems. They say, “Oh, wait! What about NLP?” Well, to do NLP, you're converting all the words or tokens or whatever you want to call them, into math – into numbers – so that you can do math with it. Yeah, based on some index-to-word mapping, after the math machine has processed everything, it gives it back as words – sure. But what's going on in the bowels of the math machines that are hooked together – say in a big, giant transformer? That's all numbers. (31:44)
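The words-to-numbers-and-back round trip described above can be shown with a toy vocabulary. This is a deliberately simplified sketch: it splits on whitespace and assigns indices alphabetically, which is an assumption for illustration and not how a learned transformer tokenizer works.

```python
sentence = "data is our most valuable asset"

# Whitespace tokenization, used here only for illustration.
tokens = sentence.split()

# Word-to-index and index-to-word lookup tables.
vocab = {word: idx for idx, word in enumerate(sorted(set(tokens)))}
index_to_word = {idx: word for word, idx in vocab.items()}

# Everything the model sees is this list of integers.
ids = [vocab[token] for token in tokens]

# Mapping the integers back gives words again.
decoded = " ".join(index_to_word[i] for i in ids)

print(ids)
print(decoded)
```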

Thom: So feature conditioning. We need to convert text to numbers. That can be as simple as one-hot encoding or ordinal encoding, and then it can get into tokenization of documents, etc. Then there is all sorts of processing that goes with that. So this is all feature conditioning and it can get quite complicated. For example, in a modern transformer, the tokenization isn't written by a human, it's learned by the machine. Then it even adds positional encoding now. It's mind-bogglingly elegant and beautiful the way it goes about it. It's embarrassing to see what the machine comes up with for tokenization because of how inconsistent we are with our language, and how consistent it's trying to be with the language that we're feeding it.
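For the simple end of feature conditioning, one-hot and ordinal encoding might be sketched with scikit-learn as below. The category values are hypothetical, and the ordering passed to `OrdinalEncoder` is an assumption that would normally come from domain knowledge.

```python
from sklearn.preprocessing import OneHotEncoder, OrdinalEncoder

# Nominal category with no natural order: one column per category.
colors = [["red"], ["green"], ["blue"], ["green"]]
one_hot = OneHotEncoder()
color_matrix = one_hot.fit_transform(colors).toarray()

# Ordered category: a single column of ranks, with the order stated explicitly.
sizes = [["small"], ["large"], ["medium"]]
ordinal = OrdinalEncoder(categories=[["small", "medium", "large"]])
size_codes = ordinal.fit_transform(sizes)

print(one_hot.categories_[0])  # column order of the one-hot matrix
print(color_matrix)
print(size_codes.ravel())
```

One-hot avoids implying an order that does not exist, while ordinal encoding keeps an order that does.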

Thom: But then, once you have your features conditioned well – and you may have to condition your labels too, many times – now we need to say, “Well, I've got all these features, but which ones are important? Which ones aren't?” That's feature reduction. I actually would prefer to call it ‘feature set optimization’. But we've all said ‘feature reduction’ now. It's not the best term, but it’s just the spirit of “I need this set of features that are important to this prediction. I don't need more, and I certainly don't need fewer, I need this specific set.” And there are some cool arts to determining that.

Alexey: So, that's how you develop business acumen – one of the things as well. You look at the important features, and you try to map it again to the business problem. Even though it's math, you don't look at it as just numbers, you're trying to connect it to whatever business problem you're solving. Right? And this is how you develop your domain expertise in a way. You start to understand “OK. This is what it means and this seems important. So let's keep this feature.” (34:25)

Scaling features and feature engineering

Thom: This is a great point you're making. Let's think of it this way too. It would be a mistake to go to your domain experts and say, “What are the features?” and trust that blindly. Instead, I would say, “What do you suspect the features will be?” But then I will come back and show them. “Hey, you were right. I'm finding that these are the features.” But I might say, “What would you say if I said this feature had this relative importance?” They go, “Wow, that makes sense. But I hadn't thought of that.” Now they're being data-informed. The business leaders know right away, “These are the features we're finding important for this dynamic – for this modeling. Wow, that's helpful. Thank you.” (34:54)

Thom: But that doesn't mean our work’s done. Also – scaling. I'm not sure I said that. Why do we need to scale our features? To get them on a level playing field. I can imagine some purist, which I tend to be, thinking, “Well, if I scale the feature, isn't that changing it from the original values?” and so, “Well, if I start reporting feet in miles instead, because there are so many feet? Does that concern you? As long as I have enough decimal places?” “No.” “Well, if I convert anything to just a new scale, is that really taking away from what the features are telling us?” I guess not, when you think of just changing the scale of a unit from miles to feet, or meters to kilometers, or whatever. But that's what we're doing. Because if we don't get those features on a common numerical scale, then when we go to find the weights of the features using modeling – and this is just initial modeling, it's not necessarily a model we put in production – how would I then know the relative importance of those features?

Thom: The big number ranges are just going to have huge weights or small weights. And the small ones that are important might have an inflated one. So if we put them all on the same playing field, the scaling becomes essential. It's an absolute prerequisite to being able to understand feature importance. But going back to feature reduction, or getting that optimized feature set – that doesn't mean we're done. Now we have to say, “Alright. What feature engineering might we need?” I've done a lot of experiments with this. If you get an engineered feature from an original feature, and it's not interacting with other features – that's highly collinear, by the way. You need to be prepared to not be alarmed by that. To me, there's this ongoing debate in my mind, “Should I do feature engineering before feature reduction?” And all I can say is, “I think it's gonna be cyclic.” And it just depends on each individual problem.
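The feet-versus-miles argument about scaling can be checked numerically. The sketch below uses synthetic, hypothetical data and plain least squares via NumPy; it shows the raw weight of a distance feature changing with its unit while the standardized weight does not.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200

# The same distance expressed in two units, plus a second feature.
miles = rng.uniform(1, 10, n)
feet = miles * 5280.0
rating = rng.uniform(1, 5, n)

# Synthetic target (assumed relationship, for illustration only).
y = 2.0 * miles + 3.0 * rating + rng.normal(0, 0.1, n)


def fit(X, target):
    """Least-squares weights for the columns of X (intercept fitted, dropped)."""
    A = np.column_stack([np.ones(len(target)), X])
    coef, *_ = np.linalg.lstsq(A, target, rcond=None)
    return coef[1:]


def standardize(X):
    """Put every column on mean 0, standard deviation 1."""
    return (X - X.mean(axis=0)) / X.std(axis=0)


raw_miles = fit(np.column_stack([miles, rating]), y)
raw_feet = fit(np.column_stack([feet, rating]), y)
std_miles = fit(standardize(np.column_stack([miles, rating])), y)
std_feet = fit(standardize(np.column_stack([feet, rating])), y)

print("raw weights (miles, rating):", raw_miles)
print("raw weights (feet, rating): ", raw_feet)
print("standardized weights, miles:", std_miles)
print("standardized weights, feet: ", std_feet)
```

In raw feet the distance weight looks negligible next to the rating weight, yet on the common scale distance carries the larger weight, and that ranking is the same whichever unit was used.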

Thom: But there are some times where feature engineering is absolutely important. You can even drop the original feature in favor of an engineered feature. For example, I've got some values as a feature. Now I try the square of those values and I try some different modeling. I find that when I take away the original feature and only use the square of those feature values, it's much more accurate. That's okay. It happens. But sometimes there are high degrees of interactions and high orders of feature engineering, in order to model what we need to model. And that's very insightful too.
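The case where a squared feature replaces the original can be reproduced on synthetic data. In this sketch the quadratic relationship is an assumption built into the data, and a one-feature least-squares fit via NumPy stands in for "some different modeling".

```python
import numpy as np

rng = np.random.default_rng(2)

# Synthetic data where the target depends on the square of the feature.
x = rng.uniform(-3, 3, 300)
y = 1.5 * x**2 + rng.normal(0, 0.5, 300)


def r_squared(feature, target):
    """R^2 of a one-feature least-squares fit with an intercept."""
    A = np.column_stack([np.ones(len(target)), feature])
    coef, *_ = np.linalg.lstsq(A, target, rcond=None)
    residuals = target - A @ coef
    return 1.0 - residuals.var() / target.var()


r2_original = r_squared(x, y)     # original feature only
r2_squared = r_squared(x**2, y)   # engineered feature only

print(f"R^2 with x:   {r2_original:.3f}")
print(f"R^2 with x^2: {r2_squared:.3f}")
```

Here the original feature explains almost nothing on its own, so dropping it in favor of the engineered one costs nothing.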

Alexey: Coming back to this feature engineering and creating new features. I think that by doing that, you also learn more about the business. For example, I work at OLX, an online marketplace – we have sellers, we have buyers and they exchange goods on the platform. What we track is the number of chat messages – we can see that it's just a number. For example, this person had that many messages come in or out. But a good feature there was the number of meaningful conversations between people. ‘Meaningful’ meaning somebody sent a message, somebody replied, and somebody replied again. There was some back-and-forth communication. (39:14)

Alexey: So this is an engineered feature. We looked at the raw data and we created this feature called ‘meaningful conversation’. This gives us a lot of knowledge about what happened. By engineering the feature, we were able to understand the business process better, and how this data actually affects our model. There is also some meaning behind this feature. These people had an actual conversation. It wasn't just “Hi” and nobody replied, but they actually talked to each other. Thus, we learned about the process. I think this is important when you create such features.
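One way such a feature could be computed, as a rough sketch in plain Python. The message log layout, the IDs, the function names, and the threshold of three turns are all assumptions made for illustration; this is not OLX's actual implementation.

```python
from collections import defaultdict

# Hypothetical message log: (conversation_id, sender_id), already in time order.
messages = [
    ("c1", "buyer_1"), ("c1", "seller_9"), ("c1", "buyer_1"),  # back-and-forth
    ("c2", "buyer_2"),                                         # "Hi", no reply
    ("c3", "buyer_3"), ("c3", "buyer_3"), ("c3", "seller_9"),  # one reply only
    ("c4", "buyer_4"), ("c4", "seller_7"), ("c4", "seller_7"), ("c4", "buyer_4"),
]

def is_meaningful(senders, min_turns=3):
    """True if the sender changes often enough to give at least `min_turns` turns."""
    turns = 1 if senders else 0
    for prev, cur in zip(senders, senders[1:]):
        if cur != prev:
            turns += 1
    return turns >= min_turns

def meaningful_conversations_per_user(messages):
    """Count, for each user, the conversations with real back-and-forth."""
    by_conversation = defaultdict(list)
    for conversation_id, sender in messages:
        by_conversation[conversation_id].append(sender)

    counts = defaultdict(int)
    for senders in by_conversation.values():
        if is_meaningful(senders):
            for user in set(senders):
                counts[user] += 1
    return dict(counts)

features = meaningful_conversations_per_user(messages)
print(features)
```

Consecutive messages from the same sender count as one turn, so a lone "Hi" or a single reply does not qualify. Writing the rule down like this forces the definition of "meaningful" to be explicit, which is where the business understanding comes from.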

PCA and collinearity

Thom: That's awesome. I agree with you. Then once we've engineered the features, we still have other questions before we go into choosing models. For example, a lot of times, the way we experiment with engineered features is to throw a lot of polynomial features into the mix, with the PolynomialFeatures class from scikit-learn. Well, now we have to go through that process again, of saying, “Which of these engineered features might I be able to drop?” And we've got great tools to help us with that analysis. Then, if your problem’s really sticky in that it's really hard to get rid of the collinearity from the original feature set and it's really hard to get the right interactions for the engineered features – we have this magic tool called ‘principal component analysis’. I've been a fan of that since grad school. (40:46)

Thom: We use those kinds of things in robotics and electrical engineering – in other words, electric circuits – we use it in vibrations analysis, control system design… because there's not a dynamics field where you don't care, (it doesn't have to be just dynamics) about the eigenvalues and the eigenvectors. They're very informative. They tell you the singularities that can happen in your system. For us, as data scientists and machine learning engineers, PCA becomes super important. I want to emphasize, being in PCA space is not the same as being in the original feature space. It's a totally different perspective. You can relate the two through the eigenvectors – and you want to do that. Let's say you end up using PCA. Now, the burden of explanation is harder on us as data scientists and machine learning engineers, because we've got to say, “This was the most important PCA feature.” But when we transfer that back to original space, it's a combination of this proportion of original space features. Hopefully, we've educated people enough and explained it in simple layman's terms to our domain experts and business leaders to help them understand, “We went into PCA, kind of as a last resort. We went into the Eigenspace as a last resort.” Because it's harder to explain – it's not impossible – but we did it because there was such messy collinearity.

Thom: My family and I, the integrated machine learning and AI family members, like to explain it this way. Collinearity is just – let's say you have a football team (I'm speaking internationally, a soccer team for Americans) and you're saying that there are two guys playing the same position. Well, what happened to the position that one of those guys should have been playing? Now there are three guys playing the same position. “Hey, we need to fix that.” We just want one feature, emphasizing each important aspect of this problem. When we have collinearity, it's saying “No. Those are too closely related. We don't need all of them. We just need the strongest one.” Well, when it's really hard to divide that up, PCA magically decouples all that.

Thom: It gets rid of all the collinearity. But it also allows you to do this fancy thing that's actually quite simple, called parsimony. So we're just saying, “Hey, we always want the simplest model that does a great job.” And I've violated this principle sometimes too, going for a more complicated model. But the spirit is – it's really elegant and easy, because you can say, “Well, which eigenvalues are very small?” Then we obviously don't need those PCA features related to those eigenvalues. Now, we've reduced the problem, we've got it all decoupled, and it's quite easy to do the feature engineering in the Eigenspace too. But again, that's that magic space. Then we go into the modeling room.
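The PCA steps described here (decouple the features, inspect the eigenvalues, drop the small ones, relate components back to the original features through the eigenvectors) can be sketched with NumPy alone. The three synthetic features and the 1% variance cutoff are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 1000

# Three features; the second is nearly a copy of the first (collinear).
f1 = rng.normal(0, 1, n)
f2 = f1 + rng.normal(0, 0.05, n)
f3 = rng.normal(0, 1, n)
X = np.column_stack([f1, f2, f3])

# Standardize, then eigendecompose the covariance matrix.
Z = (X - X.mean(axis=0)) / X.std(axis=0)
cov = np.cov(Z, rowvar=False)
eigenvalues, eigenvectors = np.linalg.eigh(cov)

# eigh returns eigenvalues in ascending order; sort largest first.
order = np.argsort(eigenvalues)[::-1]
eigenvalues, eigenvectors = eigenvalues[order], eigenvectors[:, order]

# Project into PCA space: the resulting components are uncorrelated.
scores = Z @ eigenvectors
score_cov = np.cov(scores, rowvar=False)

# Parsimony: keep components explaining more than 1% of the variance
# (the cutoff is an arbitrary choice for this sketch).
explained = eigenvalues / eigenvalues.sum()
keep = explained > 0.01
reduced = scores[:, keep]

print("explained variance ratio:", np.round(explained, 4))
print("components kept:", int(keep.sum()))
# Each eigenvector column gives the mix of original features in a component.
print("loadings of first component:", np.round(eigenvectors[:, 0], 3))
```

The loadings are what let you translate "the most important PCA feature" back into proportions of the original features when explaining the result to domain experts.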

Alexey: I wanted to say that this is the first time – I've been doing this podcast for something like 10 months, almost a year – this is the first time somebody mentioned these terms, ‘eigenvalues’ and ‘eigenvectors’. And it's funny, because this episode is about business acumen. [laughs] (45:35)

Thom: Oh, down to it. You have to throw PCA into the mix, and you're trying to give insights from each step in the pipeline back to the business. You do have a burden to say, “Well, we did the analysis that we explained here, but we ended up using PCA.” The business leaders may not care. And when they don't, I think shame on them. They should at least appreciate why we had to do that. If they're saying “No, I don't want to hear that.” That's regrettable. You should be prepared to explain it in layman's terms. But when we get into the modeling, we have metrics to help choose the models. (45:53)

Thom: Don't fall into the accuracy trap. You want good accuracy. But you want consistent accuracy. In other words, it's a generalized model – it'll work across a lot of data, (or as much data as possible). So we use, obviously, the method of cross-validation for that. But literally, we want to see a distribution of accuracies across those folds with the least variance. When we find the one with a good balance of high accuracy and low variance – that's our model. Not your favorite algorithm, but the one that, generally, has that best balance of accuracy and generalizability. And all of that's important.
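A minimal sketch of comparing candidate models by the mean and spread of their fold accuracies, assuming scikit-learn is installed. The synthetic dataset and the two candidate models are placeholders, not a recommendation.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Synthetic classification data standing in for a real problem.
X, y = make_classification(n_samples=600, n_features=10,
                           n_informative=4, random_state=0)

candidates = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "decision_tree": DecisionTreeClassifier(random_state=0),
}

summary = {}
for name, model in candidates.items():
    # One accuracy value per fold; look at both the level and the spread.
    scores = cross_val_score(model, X, y, cv=5, scoring="accuracy")
    summary[name] = (scores.mean(), scores.std())
    print(f"{name}: mean={scores.mean():.3f} std={scores.std():.3f}")
```

How to trade a slightly higher mean against a wider spread is a judgment call that depends on the problem; the sketch only puts both numbers side by side.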

Thom: It's important for the business leaders to understand that we have these battles to figure out which model is best. Then let's say you do get a machine learning model into production and it's informing the business. And the business acumen of the organization has increased through this model. The work’s ongoing. It's constant human overwatch to say, “Okay, we're collecting more data assets. We hadn't approached the central limit theorem yet with our data assets and these feature groups. So we're gonna have data drift. You know what? We may have to adjust the hyper-parameters on this model. We might need a whole new model algorithm to do a better job based on the shift in data assets.” Now, when might that go away? Probably never, but if you approach the central limit theorem and there's no data drift for years – that model can be useful for years.

Thom: But things usually do change in society. In other words, the central limit can start to change itself. Or you can have concept drift. What if a new feature is introduced into a process? That's concept drift. That may mean that “Oh, I have to add a new feature.” So constantly, as much as we have bandwidth to do so, we’re challenging, “Is the model in production continuing to be the best model for production?” Now, I would just submit, from a data science point of view, them feeding back this info, them educating the greater group on why data science is important and why machine learning is important – this improves business acumen not just for the data scientists, but for the business leaders and the domain experts.
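One simple way to watch a single feature for data drift is to compare its distribution in new data against the training data. The sketch below computes a two-sample Kolmogorov-Smirnov statistic by hand with NumPy; the function name `ks_statistic`, the synthetic batches, and any alert threshold you might apply are assumptions for illustration, not something described in the conversation.

```python
import numpy as np

def ks_statistic(reference, current):
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between empirical CDFs."""
    reference = np.sort(reference)
    current = np.sort(current)
    grid = np.concatenate([reference, current])
    cdf_ref = np.searchsorted(reference, grid, side="right") / len(reference)
    cdf_cur = np.searchsorted(current, grid, side="right") / len(current)
    return float(np.max(np.abs(cdf_ref - cdf_cur)))

rng = np.random.default_rng(3)
training = rng.normal(0.0, 1.0, 2000)  # feature values seen at training time
same = rng.normal(0.0, 1.0, 2000)      # new batch, same distribution
shifted = rng.normal(0.8, 1.0, 2000)   # new batch whose mean has drifted

stat_same = ks_statistic(training, same)
stat_shifted = ks_statistic(training, shifted)
print(f"no drift: {stat_same:.3f}")
print(f"drift:    {stat_shifted:.3f}")
```

A check like this only flags that the inputs have moved. Deciding whether to retune hyper-parameters or replace the model still needs the human overwatch described above.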

The most important business skills

Alexey: Yeah, that's interesting. I was thinking now – of all the things we discussed, I'm trying to summarize it into the most important business skills for data professionals. So first, we mentioned data storytelling – being able to analyze the data, crunch the data, and present it in a digestible form to people who might not be in the data world or might not be data scientists – such as business people. Then there’s explaining things in layman's terms, even if it's PCA. But you still need to be able to explain what's going on – all these terms like eigenvectors and eigenvalues – you need to be able to translate it into a common language that people can understand. (49:28)

Alexey: Then there’s educating why data science is important, which is something that you, as a data professional, should also be able to do. For instance saying, “Okay, you really need to be careful about this value. You really need to pay attention and not forget to fill it in.” Then you must be able to show convincing arguments of why they need to do this. So what are the other important business skills that I missed here?

Networking: lunch and beer

Thom: Well, I'm laughing because of something I learned the hard way. I'll share it as a story because I think it'll be more powerful that way. I'll just use first names, but they're the real names. My buddy Rick – he's VP of this business in this big company – I sent him an email about an idea I have that's a data science-related and experimentation platform-related case that would help us understand a lot of things we've been trying to understand better. But I made the fatal mistake of CCing some other people. Now, I'm just an engineer. But Rick's my buddy, he's a VP. But all these ants act like their anthill was stepped on. Another distinguished technologist came in the middle of a conference and dressed me down for sending that email and I had an appropriate reaction to him. Long story short – my whole team rallied around “trying to save Thom from this email he sent”. (50:42)

Thom: We gave this big formal presentation that was cleansed and everything. Later, I see Rick in the hallway, after he's asked a lot of challenging questions and listened to this dog and pony show presentation. “Rick, all I really was trying to get at is – why can't we just experiment with these devices that we own and just lease to the people? They're ours. We can do some small factor analysis experiments with open stuff.” He said, “Thom, I've had that same question.” So here's what I'm trying to get at – and this is all you need to remember – lunch and/or beer. And do it frequently.

Thom: In other words, “Rick, can we go to lunch?” “Rick, let's go out for a beer after work, I want to talk to you about something.” That would have been far more productive. We could have had a lot of micro-meetings, rather than me making the mistake of sending a formal email and CCing a lot of stakeholders. No. Just try to be culturally savvy too. Make a lot of these check-ins. You're frequently seeking feedback – make them very informal. “Hey, I just want five minutes of your time. I want to show you something.” And then always ask for help, like, “I think I can communicate this to you in layman's terms. But I may fail to do that. So I need you to let me know when you're not really getting it.” That way, we're always in a continual improvement mode on how we talk about data science to smart people that aren’t data scientists. But we're doing it in a very informal, frequent way.

Thom: Before we get that formal presentation, now we have this host of friends that know what we're doing and how passionate we are to help the business. They can inform our presentation like, “Oh, don't say that. They won't get that. Say this.” And then, “Okay, great. Thank you.” “Yeah. You know, the way you described that the other day – that was perfect. But the way you described that last week, that was really hard for me to get. Let's find a better way to do that.” What I found, Alexey, when I do it this way – where I've had those frequent check-ins with domain experts or business leaders, once we get to the formal meeting, and someone's just got to ask the challenging question to show off, I'm not the one answering those questions. The people that I've been meeting with are defending everything I'm saying. But it's because I've cared enough to say, “I need your feedback. I need you to affirm that I'm on the right track to serve the needs of the business and to answer the questions that the domain expertise has.” Where they're weak in knowledge, I’m trying to get it more data informed. So it's this constant spirit of – do it frequently, but if you're doing it frequently and formally, that's going to kill you. Do it frequently and informally. Lunch and beer. That's my summary – lunch and beer.

Alexey: Okay, so basically the most important business skill for data professionals is networking. Being able to network with people, right? (55:10)

Thom: Networking... making friends. (55:18)

Alexey: Making friends. Okay. (55:21)

Thom: Yeah. [laughs] (55:23)

Alexey: It's like the Dale Carnegie book, right? (55:22)

Thom: Now, by the way, I wish I could say I learned all this from doing it super well. No. I'm learning a lot of it from painful hindsight. But I’m getting better at it myself. I'm a student of it. But it makes perfect sense, because I'm leveraging, Alexey, the wisdom we have from all of our beautiful arts in STEM. It’s powerful stuff. (55:25)

Lunch and beer during COVID

Alexey: So what about alternatives in today's world? There is a comment from somebody named D – it's not always an option in these ‘working from home’ days, when companies are now going fully remote. How do you have lunch with somebody who is even in a different location, but you have to work together? (55:49)

Thom: That's a great question. My best friend in the world right now is someone I've never met face-to-face. I'm an American-born… worse, I'm a Texas-born convert to Roman Catholicism, and my best friend in the world is a Syrian-born, Arab Muslim, who’s now working in Germany. And we're writing a book together. But we start all our conversations with, “I have a confession.” But then we talk about our work – and we spend, maybe more time, usually talking about our personal lives and things that are important to us. Fears we have. Struggles we have. And it helps a lot. But I think I don't have to drive to a lunch location and then pay the bill. I can say, “Hey. Is it okay if I eat while we talk? I encourage you to do the same.” I'm going to show this. (56:10)

Thom: One of my sons bought me this for my birthday recently, but he couldn't ship it from England. He said, “Go buy it and I’ll Venmo you the money.” What a deal. What I'm getting at is – I have these close friendships. More friendships now that COVID has forced us to be virtual like this. I don't even like calling it virtual. I think I would like to call it ‘electronic’ or something, or ‘video’. To me – yeah, it would be nice to be face-to-face with you, Alexey. But the time to travel to where you are, the expense – this is pretty cheap, relatively speaking. And yet, we're still here. You and I got to interact with that great group in Kenya together. That's how we first met. We're getting to do this now. Please, don't let this limit your informal meetings, to say “Hey, can you get on a quick Zoom or quick Google Meet?” I think it's possible. Get good at sending memes to each other. That builds camaraderie.

Alexey: Do you have a couple of more minutes? (58:31)

Thom: Absolutely. (58:34)

Integrated machine learning and AI community

Alexey: I wanted to talk about integrated machine learning and AI that you're doing. So, can you tell us more about this? You said that at some point you started doing this because you were getting a lot of requests. You were surprised that you were very helpful in the end. How did it start and what are you doing now? (58:36)

Thom: I have to confess that I can be pretty deaf sometimes. At my last three companies, it was obvious that people like to come to me for help. But I was still not getting it, that “Okay, you're a good mentor, Thom.” I like the way one of my daughters in India put it – again, self-appointed adoptions here, not illegal, just nonlegal – Manpreet Praja, she's been one of my biggest cheerleaders and she's growing rapidly herself now. She announced on a show, “It's not that Thom's only a great mentor. He's a great learner.” And I'm just a passionate learner, really. I'm always trying to improve the way I'm growing and learning. I share that and I want to hear how others grow and learn better. (58:58)

Thom: People that I helped, they would write really nice posts about me on LinkedIn and my followers went up. But also my direct connection requests and, you know, “Will you help me?” Then we would have a one-on-one call. I would do it – I would make time for it. But I took this adage (I feel like I'm really living this now), “Do what you can. Then do what's possible. And soon, you'll find that you're doing the impossible.” So from the very first time that I had to go from one-on-one calls to a weekly call-in, I said, “Look, there are a lot of people here. We need to create three things to make this work really well.”

Thom: “First, you have to be brave to ask your question in front of others.” I might get this count wrong [laughs]. Then I'm going to answer, but I think everyone else should give their thoughts too, because that will be even better – and we'll probably all be helped by doing that. Well, it worked – from the very first question. I can't remember her last name at the moment, but Novena was the first one to bravely ask a question. Now we have people all over the world saying how much our family means to them, because it's a safe place to come to really air their fears, their concerns, when they're overwhelmed – how to deal with it. Because it's hard to be a data scientist. It's hard to get into this field. It's hard to grow in this field. And they emphasize it now too – we can do it better if we're more together.

Thom: The ‘more together’ spirit is, “I don't want to be the best. I want to be a best.” What do I mean by that? I'm getting good at something, Alexey, and you want to get good at it too. I want to help you. Because if you get good at it, too – you're going to have perspectives and abilities in that that I might not have. Such that when you come up to my level, I could grow faster with you than I would have without you. Now imagine that there's a group or a family of people that feel that way. Let me ask you this. Would you like it if Denis Rothman considered you his brother?

Alexey: Maybe. [laughs] I’ve never thought about this. [laughs] (1:02:22)

Thom: Now, Denis Rothman is this brilliant AI mind. He's written many books with Packt. He decided to have a show recently – he got his LinkedIn live – he invited us from the family because, he said, “Well, you're my family now.” He was bringing up some really important topics that we hadn't even thought about before. We wouldn't have the ability to do that if we hadn't just had this ‘more together’ spirit. He got infected by it, because we invited him, we wanted him to talk about one of his books, we asked him questions. He loved it. He started just showing up on his own. But Denis isn't the only one – we have these other fantastic family members who, this time last year, wouldn't have considered themselves data scientists. Now, they're already mentoring new people coming into the field. They're seeing the power it's having on them to have a ‘more together’ spirit and ‘a best’ spirit. It's infectious – because you grow. (1:02:25)

Alexey: Yeah. How can people join that? How can people do this? (1:03:36)

Thom: I will share a link that will always work. They can just join our Slack work group with this link. You can post this and I just ask that – if you would ask them to say “hi” in the family chat channel, I'll put that in there too. (1:03:42)

Alexey: Can you send this link to me afterwards in LinkedIn because what you send now may be lost? (1:04:03)

Thom: I'll send it to your LinkedIn, too. (1:04:10)

Alexey: I'll add the description and anyone who is interested – you will find this in the description. Probably not right now, in a couple of hours. I'll put it there. So, if you want to get the link now… maybe, you know what? I can actually send it to the live chat. (1:04:13)

Thom: I can put it in LinkedIn later but I think if you capture it right now. (1:04:32)

Alexey: I will just put it in the live chat, but it will be gone later. (1:04:38)

Thom: Yeah, you can save the chat or you can just copy and paste it now. But let me know if you need it over LinkedIn messaging. And to everyone listening, please follow me on LinkedIn. I've put my profile there real quick. I should have it memorized, but it's easier to copy and paste it anyway. (1:04:43)

Alexey: I found the link. I have it in my history because I was chatting with you just now. (1:05:14)

Thom: I'm almost there anyway. Oh, it's being... (1:05:22)

Alexey: I found it. (1:05:24)

Thom: Okay, good. Yeah, it's just the normal stuff. And then Thom Ives – all together. (1:05:25)

Alexey: Yeah, but I think your name is pretty… How do I say it? Look-up-able. (1:05:33)

Thom: Yeah. Findable. (1:05:41)

Alexey: Findable, yes. Okay. (1:05:42)

Thom: Yes. But I think look-up-able should be a word. That's it. (1:05:43)

Alexey: Yeah. Well, it is from now on, right? [laughs] (1:05:47)

Thom: There you go. I did put it in our chat. (1:05:52)

Wrapping up

Alexey: Okay. Thanks a lot, Thom, for joining us today and for sharing your experience with us – your knowledge, your stories. And thanks, everyone for joining us today as well, for being here, for asking questions. Do you want to say anything before we finish? (1:05:55)

Thom: It was an honor to be here and I really enjoyed our discussion. I know I was doing most of the talking, but I really did appreciate your questions. It was helpful. (1:06:16)

Alexey: Yeah, that's the idea behind inviting people, you know – that they talk most of the time. [laughs] (1:06:25)

Thom: Of course. But we would love to have you come visit our family and give a talk about what you do. That would be awesome for us. (1:06:32)

DataTalks.Club