
Season 21, Episode 1

Synthetic Medical Imaging Data for AI: Startup Data Engineering, MVPs & Freelance Transition | Orell Garten


How do you turn simulation research into usable synthetic medical imaging data for AI, build a minimal viable data pipeline, and pivot into freelance consulting? In this episode, Orell Garten — an electrical engineer trained in simulation algorithms who left a PhD during COVID and explored productization through a government-funded startup program — walks through that journey. We cover his simulation work in RF and wave propagation, the startup pivot to synthetic medical imaging data for AI, and the go-to-market lesson of problem-first versus technology-first.

Listen for practical data engineering guidance: minimal viable data work, simulation–HPC integration, secure data management, and an MVP workflow built on manual extraction, CSVs, and local analysis. Orell also discusses scientific-method product discovery, preventing overengineering with weekly feedback, and tool choices (Python, C++, DBT, Docker, DuckDB). He explains launching a freelance practice via LinkedIn, prototype delivery for IoT data engineering, client acquisition, and managing runway and cashflow. If you’re building synthetic data pipelines, medical imaging datasets, or transitioning to freelance data engineering, this episode delivers concrete tactics, risks to plan for, and hands-on techniques you can apply immediately.

Links:

Orell Garten

About this Guest

Orell Garten

Orell graduated in 2018 with a degree in Electrical Engineering, where he focused on simulation algorithms. He then began a PhD, but when COVID arrived, he decided to leave academia and look for new challenges. After that, he joined a government-funded startup program to explore how to turn scientific research into real products. Although that project didn’t go as planned, it taught him what he loves most: applying rigorous simulation methods to solve practical problems.




The transcripts are edited for clarity, sometimes with AI. If you notice any incorrect information, let us know.

Episode Introduction & Overview

Alexey: This week, we'll talk about many different things. We will discuss our guest’s career, simulation algorithms, and production-grade data systems. (0.0)

Alexey: We have a special guest today, Orell. Orell graduated in 2018 with a degree in electrical engineering, focusing on simulation algorithms. Then he pursued a PhD. (0.0)

Career Background: Electrical Engineering and Simulation Algorithms

Alexey: When COVID arrived, he decided to leave academia and look for new challenges. After that, he joined a startup program to explore how to turn scientific research into real products. He learned many things. Today, he is a freelancer in software and data engineering. We will talk about how he uses his deep experience in simulation to build custom data infrastructure. (2:19)

Alexey: Before we start, I want to thank Valeria, who prepared the questions for today’s interview. Normally Johanna does that, but she has been busy with summer school. So thank you, Valeria, for your help. Welcome, Orell! Let’s start the interview. (2:45)

Orell: Hi Alexey, thanks for having me. I'm excited to be here. (3:10)

Transition Out of Academia During COVID

Alexey: My pleasure. Before we dive into building custom data infrastructures, let’s talk about your career. I briefly outlined it, but maybe you can share more details about your career journey so far. (3:16)

Orell: Sure. In my final years of high school, I was unsure what to pursue. I knew I wanted to do something technical like most of us did. (3:34)

Alexey: Is it very hard in Germany? When do you graduate, at 17 or later? (3:46)

Orell: I graduated at 18, or actually 19, but I was still very young. I really had no idea what I wanted to do. (3:53)

Alexey: I’m originally from Russia, and there we graduate at around 17. When you’re 17, you have no idea what you’re going to have for breakfast, let alone your future career. It’s difficult to decide. (3:58)

Orell: Exactly. You also don’t know what the daily work looks like. You might think about engineering or computer science but have no idea what the job entails. I coded a bit in my spare time and decided to study electrical engineering because people said, “If you want to code, go to electrical engineering. For theoretical computer science, go to computer science.” I also dabbled in embedded systems with microcontrollers back then. (4:09)

Simulation Research: RF and Wave Propagation Modeling

Orell: During my studies, I realized that coding and building systems is what I enjoy. Then I focused on simulation engineering, developing custom simulation algorithms for problems like RF wave propagation. (4:42)

Alexey: I remember we studied simulations too, like simulating a supermarket to optimize customer flow and queuing strategies. For example, deciding whether to have a queue per cashier or a single queue. The optimal strategy is usually one queue. Was that similar to what you studied? (5:09)

Orell: It’s a bit different. Simulation can mean many things. What I worked on was wave propagation: audio waves, or electromagnetic waves used in radar or mobile communications. (5:58)

Orell: We used physical equations from physics theory to simulate how waves behave in real-life scenarios. This is quite a hard problem because it takes a very long time, especially for large scenarios like wave propagation in a city where waves bounce off buildings and interact with the environment. It was challenging because it involved physics, math, and coding. You have to optimize your code, data formats, and strategies, which made it quite interesting. (6:16)

Alexey: Interesting. But you are not doing this work anymore, right? (7:17)

Orell: No, after my studies I started a PhD but quit when I realized it wasn’t what I wanted to do, even within electrical engineering and simulation. Then COVID hit, and a friend and I tried to start a startup to monetize simulations and related systems. (7:24)

Orell: We didn’t succeed, but we learned a lot about data engineering, economics, startups, talking to customers, and understanding their needs. When we stopped working on the startup, I decided I didn’t want to do a 9-to-5 job. I wanted to work for myself and experiment with different opportunities. So freelancing came naturally. (8:00)

Alexey: Can you tell us more about the startup? What exactly were you trying to build? (8:44)

Startup Pivot: Synthetic Medical Imaging Data for AI

Orell: We had a different problem. We focused on medical imaging like MRI and X-rays. We wanted to develop AI algorithms to analyze the images because it’s repetitive and boring work. (9:04)

Orell: We simulated the physics of imaging machines and processes to create synthetic data for AI training. That’s what we tried to monetize. (9:04)

Go-to-Market Lesson: Problem-First vs Technology-First

Orell: However, it didn’t work out for several reasons. One main issue was that we started with technology, not a problem. Later, we realized it wasn’t seen as a problem at the time, especially before ChatGPT and generative AI became popular. (9:42)

Orell: Medical companies often work only when hospitals specifically request certain MRI functionalities. There is little in-house research, so monetizing the technology was very difficult. (10:06)

Alexey: I had a funny experience with MRI. I had back problems, and doctors found something concerning on the MRI, but I did not feel constant pain. They said I should be in pain all the time according to the scan, but I wasn’t. (10:32)

Orell: Sometimes the body works in surprising ways. (11:01)

Alexey: Maybe the body compensates. Perhaps it was an old injury that showed on the MRI but no longer caused pain because the body adapted. (11:07)

Orell: That could be it, but I’m not a medical professional. (11:21)

Alexey: You learned a few things about MRI though? (11:26)

Orell: Yes, about MRI and also the medical market, especially in Germany. The medical market is capped financially; the money available depends on what people pay into the system. (11:32)

Orell: If you make money, you take it from someone else in the market, so it’s not a win-win situation. Someone has to lose for you to make money, which is not ideal for startups. (11:44)

Alexey: In the ideal world, there is a finite amount of money, assuming governments are not printing money. So you are always competing for funds. If someone spends on your product, they can’t spend it on another. (12:10)

Orell: That is true to some extent, but I believe governments do print money. Also, money can move around. Spending on one thing can free up spending elsewhere, so some win-win possibilities exist. (12:30)

Orell: Though someone often loses, it helps if your competitors don’t feel threatened. Then they might be more willing to help you. (12:56)

Early Data Engineering Practice: Minimal Viable Data Work

Alexey: What kind of things did you build for the startup? You said you learned about data engineering, economics, and talking to customers. What did you actually do regarding data engineering? Did you need a platform? (13:20)

Orell: Yes, but we started small. In startups, you don’t know what people need at first. Many bigger companies are also unsure. (13:45)

Orell: It’s important to do the minimum work needed to make the data work at the moment. (13:45)

Orell: For example, if you have no customers, it doesn’t matter if there is a data pipeline since it is not used. (13:45)

Simulation-HPC Integration and Secure Data Management

Orell: Sometimes doing things manually, like triggering pipelines, gives you more control early on. (14:21)

Orell: We had simulation algorithms, which could be considered AI or machine learning kernels, and data infrastructure to move data to simulations running on high-performance clusters, then retrieve results. (14:21)

Orell: It’s crucial to manage data properly to avoid mixing client data, especially when clients might be competitors. (14:55)

Orell: That is data management, not magic. Keeping it simple is the best approach in startups. (14:55)
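One simple way to keep client data from mixing is to give every client its own directory and refuse any path that leaves it. The following is a minimal sketch of that idea, not the setup the startup used; the function name `client_path`, the directory layout, and the client identifiers are all hypothetical.

```python
import re
from pathlib import Path

# Hypothetical rule: client ids are lowercase letters, digits, "_" or "-".
CLIENT_ID = re.compile(r"[a-z0-9_-]+")


def client_path(root: Path, client_id: str, filename: str) -> Path:
    """Return a path inside the client's own directory, or raise ValueError."""
    if not CLIENT_ID.fullmatch(client_id):
        raise ValueError(f"invalid client id: {client_id!r}")
    base = (root / client_id).resolve()
    target = (base / filename).resolve()
    # Reject names such as "../other_client/x" that escape the client directory.
    if base not in target.parents:
        raise ValueError(f"{filename!r} escapes the directory of {client_id!r}")
    return target


if __name__ == "__main__":
    print(client_path(Path("/tmp/data"), "acme", "run1/input.csv"))
```

Routing every read and write through one small function like this keeps the rule in a single place, which fits the "keep it simple" advice above.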

Alexey: Was this something you had to learn, or did you already know it from your PhD? (15:26)

Orell: We had to learn it. In the PhD, it was all about science and algorithms, not implementation. It was about trade-offs in equations, not practical code. That’s one reason I left academia; nobody cared about the implementation or what the work could actually do. It was mostly about formatting the equations correctly. (15:34)

Iteration Differences: Academia vs. Startup Timelines

Alexey: I assume the iteration cycles you have in academia differ a lot from startups. (16:05)

Alexey: In startups, you test ideas quickly, build MVPs fast to satisfy customer needs, whereas in academia you might have a five-year grant to develop a specific thing. (16:05)

Orell: That’s true. Academia has longer project timelines. Usually, the project has a specific goal funded by a grant. (16:39)

Orell: However, the overall process isn’t that different because in startups you also test hypotheses, but about markets and customer problems. (16:59)

Orell: In academia, the value is in publishing papers. In startups, it’s selling products or services. Very similar processes but with different definitions of success. (16:59)

Alexey: Can you tell us more about the scientific approach? How do you formulate hypotheses, and how do you use that as a data engineer? (17:50)

Scientific Method in Product Discovery and Hypothesis Testing

Orell: In startups, you ask specific, narrow questions. (17:55)

Orell: For example, “Is this a problem for companies with fewer than 10 employees?” You must specify the problem clearly. Then you talk to as many companies as needed to validate or invalidate your hypothesis. (17:55)

Orell: Data engineering is more complicated because it is behind the scenes. You only notice it when something breaks. There are no dashboards or visible results. It’s a lot of plumbing to make data flow. (18:40)

Orell: You have to talk to many people to understand their problems, but it is harder to find out what companies truly need because each is very different. (18:40)

Freelance Launch: From CTO Role to Consulting via LinkedIn

Alexey: That’s very interesting. I am noting what companies need because I want to come back to that later. Before that, I want to talk about your transition from startup builder to freelancing data engineer. You enjoyed data engineering in the startup and wanted to focus on it independently. (19:34)

Alexey: Can you tell us how that happened and how you started freelancing? (19:34)

Orell: It happened naturally. Near the end of our startup journey, I received a LinkedIn request from someone wanting to pay me for consulting, which was very exciting. (20:20)

Alexey: I just wonder, you did a startup as a co-founder. What was your role? Were you the CTO? (20:39)

Orell: Yes, mostly like CTO. At that stage, roles weren’t well defined; everyone did everything. (20:47)

Alexey: How did your experience as a co-founder translate to the customer request you received? Did you highlight your data platform skills? What exactly did the customer ask for? For others hoping to get LinkedIn leads, how did you make it happen? (20:59)

Orell: It happened through a contact from the startup days. We’d discussed their problems earlier, and later they came back and asked if I could help. It was a small thing, maybe a week of work, but it was a big deal to have someone pay for my skills. Even after a failed startup, my skills had value. That helped me see freelancing could work. From there, I focused on doing the best work, networking, and sharing on LinkedIn. Building a network made it easier to reconnect and get referrals, especially once COVID eased. Of course, you need to deliver quality work, but that’s usually not the main problem. (21:24)

Prototype Delivery: IoT Data Engineering Proof of Concept

Alexey: You mentioned doing customer interviews and then closing your business, but this client still had a problem and asked you to help. What was the problem, if you can share? (22:59)

Orell: It was a prototype for a data engineering IoT project. That’s about all I can share. (23:29)

Alexey: How did you have the needed skills? Was it from your startup? (23:42)

Orell: Yes, it was close. After the simulations didn’t work out, we tried other things. My electrical engineering background showed I could handle embedded systems. We tried different cloud platforms. I learned a lot that year before I got the request. But the biggest thing is system thinking, not just technology choices. AWS, GCP, or Azure usually doesn’t matter as much as design and making the right trade-offs. (23:48)

Alexey: Did that project actually take just one week? (24:55)

Orell: Yes, about a week. We delivered the prototype and then the client decided where to go next. It fizzled out, maybe the solution wasn’t what they needed, but that’s not really my story to share. (25:02)

Alexey: So for you the main thing was seeing that your skills were in demand and people would pay for them. What happened next? (25:20)

Freelance Risks: Runway, Cashflow, and Operating Expenses

Orell: Then came a tough time: a three- or four-month drought with very little project work. I barely covered my bills. The market in Germany soured: projects disappeared or got cancelled, and budgets were frozen. That just happens sometimes. (25:33)

Alexey: Many companies went bankrupt during this time. (26:26)

Orell: Yes, that too. I also didn’t set a clear deadline for myself, like giving it a year to decide if freelancing was viable. I just kept going, which in hindsight was risky. You can’t do it until the money runs out; you need a buffer. (26:33)

Alexey: If you run out of money, you’d be living under a bridge. Germany has a good safety net, but not everywhere does. (26:58)

Orell: That’s why you should start lean and keep fixed costs down, so your savings last. For any startup, you want at least six months of operating expenses in the bank. (27:12)

Alexey: What expenses do freelancers have, besides the bank account? (27:37)

Orell: Tax accountant, hardware, software subscriptions, occasional travel for customer meetings, perhaps platform accounts. Plus, you often have to wait 30–45 days for payment after invoicing, so even in the best-case scenario, you go weeks without getting paid. You still need to pay rent, food, and living costs at that time. (27:43)

Alexey: Is Leipzig getting more expensive now? (28:54)

Orell: Yes, though it’s still cheaper than Berlin or Munich. But prices are up, there’s little housing and lots of demand, a typical problem in Germany. (29:00)

Alexey: How much runway do I need before freelancing? Six months? (29:44)

Orell: Six months is a good baseline, longer is better, but six months to a year is reasonable. Sometimes you get lucky and are fully booked immediately, but that’s rare unless you have a giant network. (29:50)
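The runway arithmetic discussed here can be written down in a few lines. This is only a sketch with made-up numbers: the savings and monthly cost figures are hypothetical, and the 45-day payment term is the upper end of the 30 to 45 days mentioned earlier in the conversation.

```python
def runway_months(savings: float, monthly_costs: float) -> float:
    """Months you can cover fixed costs from savings with no income at all."""
    return savings / monthly_costs


def cash_needed_before_first_payment(
    monthly_costs: float, days_until_invoice: int, payment_terms_days: int
) -> float:
    """Costs to bridge between starting a project and the first payment arriving.

    Uses a 30-day month as a simplification.
    """
    days_without_income = days_until_invoice + payment_terms_days
    return monthly_costs * days_without_income / 30


if __name__ == "__main__":
    # Hypothetical figures, for illustration only.
    print(runway_months(savings=15_000, monthly_costs=2_500))  # 6.0
    print(cash_needed_before_first_payment(2_500, 30, 45))     # 6250.0
```

With these example numbers, a single project invoiced after one month and paid 45 days later already consumes two and a half months of the six-month buffer, which is why keeping fixed costs low matters.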

Alexey: How did you solve this for yourself? (30:39)

Client Acquisition: Networking, Recruiters, and Referrals

Orell: Networking. I mentioned that I’m self-employed whenever it fits in business and tech contexts. If people don’t know you’re available, they can’t reach out. (30:50)

Alexey: After the first project, months went by before you found more work. You said networking helped long term. What else worked in the beginning? (32:21)

Orell: At first, I applied for freelance projects via recruiters. It's common in Germany, less so elsewhere. That’s how I got my biggest client last year. Once you land work, more follows, and momentum builds. If you have nothing, it gets harder and rates can be pressured down. (32:47)

Specialization: Industrial Data Integration and Custom ETL

Alexey: What do you actually do as a freelancer? Data engineering is broad. Do you specialize? (34:22)

Orell: I focus on the software side of data engineering, not dashboarding or PowerBI. I build data pipelines, preparing and managing data so it’s ready for analytics or warehousing. Most of my clients are industrial with many types of machines and formats, even different variants from the same manufacturer. I do a lot of custom integration and transformation. (34:35)

Alexey: Do clients know what they want, like a specific dashboard, and you help make that data available, or do you help them discover what they need? (35:49)

Orell: It depends. Some clients come with specific ideas and I help implement them, sometimes consulting on what else makes sense. Other clients only know they have data and want analysis, so we begin with exploration. It often isn’t clean or standard. Sometimes a CSV is enough for a basic first step, showing what’s possible and surfacing problems. The key is always understanding their business goals and providing real value. (36:24)

Alexey: On your LinkedIn, you say you help clients start small, focus on immediate value, and then scale. Can you walk us through an example, even a hypothetical one? (38:14)

MVP Workflow: Manual Extraction, CSVs, and Local Analysis

Orell: The first thing I always do is look at the data: what’s in it and what does the schema look like? Usually, even providing documentation about the data brings immediate value for companies. (39:00)
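A first look at the data can be as small as a script that lists each column, guesses its type, and counts missing values, which already gives a rough piece of documentation. Below is a minimal sketch using only the standard library; the column names and sample rows are invented for illustration, and real machine exports will look different.

```python
import csv
import io


def infer_type(values: list) -> str:
    """Guess a column type from its non-empty string values."""
    if not values:
        return "unknown"
    for name, cast in (("integer", int), ("float", float)):
        try:
            for value in values:
                cast(value)
            return name
        except ValueError:
            continue
    return "text"


def profile_csv(text: str) -> dict:
    """Summarize each column: inferred type, empty-value count, example value."""
    rows = list(csv.DictReader(io.StringIO(text)))
    profile = {}
    for column in (rows[0].keys() if rows else []):
        values = [row[column] for row in rows]
        non_empty = [v for v in values if v not in ("", None)]
        profile[column] = {
            "type": infer_type(non_empty),
            "empty": len(values) - len(non_empty),
            "example": non_empty[0] if non_empty else None,
        }
    return profile


# Hypothetical sample export from one machine.
SAMPLE = """timestamp,machine_id,temperature_c,status
2024-05-01T00:00:00,7,71.5,OK
2024-05-01T00:01:00,7,,OK
2024-05-01T00:02:00,7,73.0,ERROR
"""

if __name__ == "__main__":
    for column, info in profile_csv(SAMPLE).items():
        print(column, info)
```

The output is deliberately crude. Its purpose is to give the client something concrete to confirm or correct, not to replace a proper schema.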

Alexey: I imagine, based on what we've been talking about, you have some equipment sitting in a factory, some industrial machine. The goal is to understand if the machine is performing correctly, but you have no idea what kind of sensors it has, what data it's sending, what the schema is, or if it's binary or JSON. I'm just making this up; I have no idea how these machines work. But that's what you need to figure out, right? (39:19)

Orell: That would be the worst case if you have no idea about the data situation at all. Usually, you have some sense of what sensors are installed and what the values mean, but often the documentation doesn’t match the actual data. That’s the first challenge. You also want to know things like, "We get an error every 12 hours on that machine—what's happening?" So you try to find patterns or problems you can identify. Before doing any automation, you have to understand what the data looks like, how different machines work together in the same manufacturing line, that kind of thing. (39:56)
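A question like "we get an error every 12 hours" can be checked by measuring the gaps between consecutive error events. The sketch below assumes error timestamps have already been extracted as ISO 8601 strings; the sample timestamps are invented, and rounding to whole hours is an arbitrary choice for this illustration.

```python
from collections import Counter
from datetime import datetime


def error_gaps_hours(timestamps: list) -> list:
    """Gaps between consecutive errors, rounded to whole hours."""
    times = sorted(datetime.fromisoformat(t) for t in timestamps)
    return [
        round((later - earlier).total_seconds() / 3600)
        for earlier, later in zip(times, times[1:])
    ]


def most_common_gap(timestamps: list):
    """Return (gap_in_hours, occurrences) for the most frequent gap, or None."""
    gaps = error_gaps_hours(timestamps)
    if not gaps:
        return None
    return Counter(gaps).most_common(1)[0]


# Hypothetical error timestamps from one machine.
ERRORS = [
    "2024-05-01T03:10:00",
    "2024-05-01T15:12:00",
    "2024-05-02T03:09:00",
    "2024-05-02T09:40:00",
    "2024-05-02T15:11:00",
]

if __name__ == "__main__":
    print(error_gaps_hours(ERRORS))  # [12, 12, 7, 6]
    print(most_common_gap(ERRORS))   # (12, 2)
```

A recurring gap is only a hint. Whether it points to a shift change, a scheduled job, or a sensor issue still has to come from people who know the line.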

Orell: A lot of machines now are intelligent or at least connected, so their data is usually available. Sometimes you need custom vendor-provided software to decode it, but often it’s just log files, terminal outputs, or JSON containing sensor values. There are different technologies (REST APIs, MQTT, sometimes even Kafka), so it’s all over the place. That makes it interesting, but also challenging, and requires quite a bit of infrastructure to integrate all these different data processes. (40:53)
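When sensor values arrive as JSON from one machine and as log lines from another, a common first step is to map both into one record type before doing anything else. The sketch below shows that idea only; the JSON keys, the log line layout, and the `Reading` fields are assumptions made up for this example and do not describe any particular vendor format.

```python
import json
import re
from typing import NamedTuple, Optional


class Reading(NamedTuple):
    """One sensor value in a common shape, whatever the source format was."""
    machine: str
    sensor: str
    value: float
    timestamp: str


# Hypothetical log layout: "<timestamp> <machine> <sensor>=<number>"
LOG_LINE = re.compile(
    r"^(?P<ts>\S+) (?P<machine>\S+) (?P<sensor>\w+)=(?P<value>-?\d+(?:\.\d+)?)$"
)


def from_json(payload: str) -> Reading:
    """Parse a hypothetical JSON payload with keys machine, sensor, value, ts."""
    doc = json.loads(payload)
    return Reading(doc["machine"], doc["sensor"], float(doc["value"]), doc["ts"])


def from_log_line(line: str) -> Optional[Reading]:
    """Parse one log line; return None for lines that do not match the layout."""
    match = LOG_LINE.match(line.strip())
    if match is None:
        return None
    return Reading(
        match["machine"], match["sensor"], float(match["value"]), match["ts"]
    )


if __name__ == "__main__":
    print(from_json('{"machine": "press-1", "sensor": "temp", '
                    '"value": 71.5, "ts": "2024-05-01T00:00:00"}'))
    print(from_log_line("2024-05-01T00:00:00 press-1 temp=71.5"))
```

Returning `None` for unparseable lines, rather than raising, makes it easy to count how much of a file does not fit the documented format, which is often the first surprise.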

Alexey: But you probably don't start with the infrastructure part. You first figure out what kind of data there is, how to access it, how to get the data, and then you build the first MVP, right? You put the data into a CSV file and say, "Here's a dashboard I can build. Does this work for you?" Is this what you do? (41:54)

Orell: Yeah, basically. It might sound really simple, but it works pretty well. You just pull files for one day, or depending on how big they are, for a set period onto your computer, and then analyze them locally. Then you can say, "Okay, on this day, in this hour, we see this problem or this insight." You start building scripts that transform the data and make sense of it. Once that's done, you move to a more automated approach, where you ingest the data automatically and process it on a schedule or with stream processing, depending on the needs. (42:16)
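The "one day of files, analyzed locally" step might look like the following sketch: count error rows per hour so you can say in which hour a problem shows up. It uses only the standard library and an inline CSV string so it runs as-is; in practice the text would come from a file pulled onto your machine. The column names and the `ERROR` status value are assumptions for illustration.

```python
import csv
import io
from collections import Counter
from datetime import datetime


def errors_per_hour(
    csv_text: str, time_column: str = "timestamp", status_column: str = "status"
) -> dict:
    """Count rows with status ERROR, grouped by hour of day."""
    counts = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        if row[status_column] == "ERROR":
            counts[datetime.fromisoformat(row[time_column]).hour] += 1
    return dict(sorted(counts.items()))


# Hypothetical extract covering a single day for one machine.
DAY = """timestamp,machine_id,status
2024-05-01T02:15:00,7,OK
2024-05-01T03:10:00,7,ERROR
2024-05-01T03:40:00,7,ERROR
2024-05-01T15:12:00,7,ERROR
2024-05-01T16:00:00,7,OK
"""

if __name__ == "__main__":
    print(errors_per_hour(DAY))  # {3: 2, 15: 1}
```

Once a script like this answers a real question for the client, it becomes a natural candidate for the scheduled or streaming version described above.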

Alexey: So this is when you start building the infrastructure part. (42:58)

Orell: Yes, exactly. If you start investing in infrastructure before you know what you want to do, your infrastructure is probably not what you actually need and is likely overengineered. You end up supporting all possible use cases, but realistically, you only need a few. (42:58)

Preventing Overengineering: Weekly Feedback and Iteration

Alexey: How do you keep yourself in check and avoid overengineering when it's not necessary? (43:27)

Orell: Good question, it's a tough one. You have to keep reminding yourself to keep things simple. Also, regular meetings help. They don’t have to be daily or even weekly, but having a tight feedback loop with your clients keeps things simple. With weekly meetings, you can't implement overly complex solutions all at once. If you show or talk about what you did and the results every week, you’re forced to keep it simple and deliver each week. Then, when necessary, you can gradually increase complexity. (43:34)

Alexey: Isn't that the idea behind Scrum, with these demo days or whatever they're called, where you're forced to focus on a demo, and to build a demo, you need a minimal working pipeline? Then you do the demo, adjust as needed, and iterate. (44:17)

Orell: Yeah, that’s what Scrum or Agile would look like in an ideal world, but a lot of companies use the terminology without really following the practice, and it ends up as micromanagement. For me, it’s about working with clients to find the way that works best. No one is forcing me to be Agile or do Scrum; it’s about figuring out the best approach together. I can’t just disappear for 6 months and come back with the perfect solution, which might not even be needed. That would be expensive and bad for my reputation. (44:50)

Alexey: I guess your startup experience helped make you more pragmatic and focused. (45:51)

Orell: Yes, definitely. That was one thing I learned in the startup: be pragmatic, keep it simple, deliver quickly, and go from there. (45:58)

Alexey: If you work at a company, you always know when the next paycheck comes, so you’re more relaxed and might want to engineer the perfect solution. But at a startup you don’t have a year; you have next month, and if you don’t do something, you’re out. (46:10)

Orell: Yeah, you have to think like a businessperson, not just an engineer. As an engineer, you want to build all the cool things and new tech because it's fun and a learning experience, but it doesn’t always deliver value. As a full-time employee, you might have more time for experiments and demos and get paid for it, but as a freelancer, you have to experiment and try new tech on your own time. (46:43)

Alexey: How do you set aside time to grow as an engineer and play with new technologies? (47:32)

Orell: I block time in my calendar: Friday afternoon, the weekend, or whenever fits my life. You have to make time for this, otherwise it doesn’t happen. Even when you block time, sometimes other things are more important. Keeping up with developments by following people on LinkedIn, being part of communities, and listening to podcasts helps you stay current, even if you don’t know all the details. So when you encounter a problem where a new tech might help, you remember it and can decide if it’s worth exploring. (47:38)

Alexey: So you need a broad overview of what’s possible in your domain. (48:48)

Orell: Yes, it helps you make better decisions and know your options, even if you don’t know the details. You’ll often fall back on what you know best, and that’s fine. (48:54)

Alexey: How do you structure this learning time, considering you also run a community and always have things to do? (49:22)

Continuous Learning: Practical Experiments and DuckDB

Alexey: If you hear about new tech like DuckDB, which some say is super cool, lets you do everything locally, and can be faster than Spark, how do you find time to try it and really learn, so you might use it one day? (49:59)

Orell: DuckDB is awesome, by the way! What I do is start with the 'get started' tutorial; there’s almost always one. It teaches you the basic ideas and the way to think about the technology. The easier it is to try, the easier to get hands-on. With DuckDB, you only need a computer. No server needed, and they even have example data. So getting started is very easy. Sometimes I struggle with what to build, but tutorials usually give ideas. Then if I see a problem that fits, I’ll try that technology. Sometimes, I solve problems I’ve solved before using the new tool, comparing approaches, benchmarks, code length, whatever. That’s a good way to learn, and it prevents you from being stuck thinking about what new thing to code. (50:33)

Alexey: If you want to learn, how do you pick a project? It's tough. Sometimes the customer problem matches a new tech, and you can use it. But sometimes you might not use the best tool. (52:25)

Orell: That’s always how it is. It's better to have something that works even if it's not perfect than to wait for the perfect solution that might not exist for years. You can always swap tools later, though it's not fun. If the tool works, it’s probably good enough, unless you’re in the top 5% for performance. (53:03)

LLMs for Data Cleaning: Domain Knowledge Limitations

Alexey: We have questions. Even though you don’t work with AI/ML, have you come across LLMs used for data cleaning or ingestion? (53:42)

Orell: I see the idea often, but the main problem with data cleaning is domain knowledge. For an industrial company, you need to deeply understand what the values mean. Data cleaning always changes the data, so it’s not raw anymore. You have to know what the data means for the business, and that’s tough for an LLM. I usually talk to clients for hours to really get it. Maybe an LLM could flag things that look out of place, but I don’t think it’s the best solution right now. (53:54)

Alexey: What was the biggest challenge about going freelance? (55:41)

Orell: Acquiring clients. If you solve that, freelancing can be flexible and rewarding, but if you haven’t solved it, it’s painful. Most people already have the skills from full-time work, but client acquisition is a different skill set. (55:46)

Alexey: Are you an introvert or an extrovert? (57:43)

Orell: I’m an introvert. I can do networking but find it exhausting. After networking events, I need downtime. But you have to do it, or you won’t get clients. (57:47)

Tech Stack & Systems Thinking: Python, C++, DBT, Docker

Alexey: What is your daily tech stack or main skills? (58:29)

Orell: My main tools are Python and C++, though Python is 90% of my work now. C++ is mostly for industrial hardware or high-performance computing from my simulation days. Recently, I've used DuckDB a lot for both prototyping and actual pipelines; it's flexible and integrates well with Python. I also use Linux, DBT, and Docker, depending on the client. Software engineering skills help, but my main focus is problem-solving, not the specific tech stack. The language is just a preference; the real skill is figuring out the problem and delivering a suitable solution. (58:37)

Manual Data Exploration: Handling Edge Cases Before Automation

Alexey: So the main skill in your working life is problem solving? (1:00:53)

Orell: Yes. The tech part comes later. First, figure out the real problem, then decide on the best or a good enough solution. Sometimes, manual work is fastest for the first iteration, just filtering or classifying data by hand. This helps you learn about the data and its edge cases, which are hard to code for. That’s where experience and intuition are valuable, and maybe where LLMs could help in the future. (1:00:59)

Alexey: For me, with a data science background, manual exploration is super useful. It might sound boring, but it's worth it for learning the data. (1:02:20)

Orell: Absolutely. Thanks for having me. I hope my story motivates some people to try freelancing. I enjoyed the conversation. (1:02:43)

Closing Remarks and Freelancing Advice

Alexey: Thanks to Orell and everyone for joining. Don’t forget we have more events coming up. If you have guest suggestions, let me know! Enjoy the summer and see you next time. (1:03:31)
