Dream job Data Science – Part II: Insights into the everyday life of a Data Scientist
Part two of our interview with Dr. Dennis Müller
Data Scientist: one of the hottest jobs at the moment. But how does one actually become a Data Scientist, and what does the day-to-day work of a data specialist look like? We talked to Dr. Dennis Müller, Data Scientist at Kenbun IT AG, about what it is like to be a Data Scientist.
In part two we get into the action and take a closer look at the daily work of a Data Scientist. Which tasks do they have to cope with? Which programs and tools do they use? And what does an ideal working environment for a Data Scientist look like?
What does your working day at Kenbun look like?
I get up in the morning and cycle to the technology hub in the Karlsruhe technology region. In an inconspicuous-looking building in the Oststadt you will find the Technology Factory, the birthplace and headquarters of many innovative companies, including Kenbun IT AG. This is where my workplace is located. On my desk is a modern computer, connected to an even more modern monitor. We work with the latest technology: cloud machines with appropriate GPU power from NVIDIA. Depending on the project, I supervise a model training run, analyse data or prepare a workshop.
Which programs do you mainly work with? Which of your programming skills do you use the most?
The skill I probably use the most is programming in a scripting language like Python or R. The Jupyter ecosystem, such as JupyterLab or notebooks, is also helpful, because it lets you not only manipulate data and train models but also document the procedure with corresponding visualizations. Depending on the requirements, PyCharm or Spyder are also suitable for programming. For handling deep-learning-based models, basic knowledge of TensorFlow and PyTorch is indispensable. For transforming structured data, the Python library pandas is the first place to go. Rather than list further libraries, I will refer you to one of the many "Data Science with Python" books.
For creating presentations I usually work with PowerPoint, less often with LaTeX.
Since much of our model training runs remotely, I have refreshed my knowledge of the Linux shell, for example to automate recurring work steps. Finally, it is important for me to follow current developments through new papers, which you can find, for example, with Google Scholar or arxiv-sanity.com.
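To give a flavour of the structured-data transformation with pandas mentioned above, here is a minimal sketch. The sensor data, the column names and the summarize function are hypothetical and invented for illustration; they are not taken from any Kenbun project.

```python
import pandas as pd

# Hypothetical sensor readings; columns and values are invented for illustration.
raw = pd.DataFrame(
    {
        "machine": ["A", "A", "B", "B", "B"],
        "temperature_c": [70.5, None, 65.0, 66.5, 64.0],
        "status": ["ok", "ok", "ok", "error", "ok"],
    }
)


def summarize(df: pd.DataFrame) -> pd.DataFrame:
    """Drop rows without a temperature reading and aggregate per machine."""
    cleaned = df.dropna(subset=["temperature_c"])
    return (
        cleaned.groupby("machine")
        .agg(
            # Mean temperature of the remaining readings for each machine.
            mean_temperature_c=("temperature_c", "mean"),
            # Number of remaining readings flagged with status "error".
            error_count=("status", lambda s: int((s == "error").sum())),
        )
        .reset_index()
    )


summary = summarize(raw)
print(summary)
```

In a Jupyter notebook, the resulting table could sit right next to the text and plots that document the analysis, which is the advantage described in the answer above.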
Why did you decide to work at a start-up? What do you think are the advantages compared to working for a corporation?
Working at a start-up is extremely attractive because you get a complete overview of all process steps within the company. From the first contact with a customer to the commissioning of a model – you are always close to all projects and can participate. This makes my work very versatile and varied.
I like working at Kenbun because I am not only an expert in my field, but also a source of ideas for problems across the entire data science environment. I have the opportunity to actively participate in many areas and thus create real value for the company.
Kenbun recently released its first product of its own, an AI platform called Kidan. As a Data Scientist, you are ultimately the person who has to work with such a platform. Why is it so important for a Data Scientist to have a suitable AI platform? Does it make sense for me as an employer to invest in such a platform?
Yes, definitely: good AI platforms simplify the work of Data Scientists and save a lot of time.
You can picture it as follows: the less time you have to spend on connecting the data to the analytics platform or the respective interface, the more time you can devote to actually modeling the problem. If, for example, you first have to find out in which database the required information is located, in which format it is stored, whether it has been used consistently, or whether backup copies are available, a lot of time is lost. The sooner the actual value-adding work can begin, the faster you can iterate through the data science lifecycle, which leads to better models and deeper insights into the problem class and the ultimate solution. And in the end, as a Data Scientist, you don't have to worry about delivering the model, because good AI platforms also support this step of the data science lifecycle.
Many thanks, Dr. Dennis Müller.