As an experienced ML developer who has hired numerous junior engineers for various projects, I have found that there are certain essential skills junior developers must have to be considered for a job in the field. The exact requirements vary by project and company, but some fundamentals are expected almost everywhere. In this article, we will discuss the key skills that help junior ML developers succeed in their job search and land their first job.
What skills do most junior developers who apply for a job have?
Junior developers seeking their first job often come from other fields and have completed a few ML courses. They have a basic understanding of ML but lack a strong background in engineering, computer science, or mathematics. While a math degree is not mandatory for becoming a programmer, it is highly recommended in the field of ML. Machine learning and data science involve experimentation, fine-tuning of algorithms, and even creating new ones, all of which require some mathematical knowledge. University graduates with a good degree have an advantage here. However, although they may have deeper technical knowledge than the average junior without specialized education, they often lack the practical skills and experience a job requires, because college education focuses on fundamental knowledge and often neglects marketable skills. Most applicants for junior ML engineer positions are familiar with SQL, vector embeddings, and basic time series analysis algorithms. They may also have used Python libraries like scikit-learn and applied techniques such as clustering, regression, and random forests. However, these skills alone are not sufficient.
What skills do popular courses not provide?
As you now understand, most educational programs fail to provide hands-on experience and a deep understanding of the subject matter. If you are determined to build a career in ML, you must learn additional skills on your own to make yourself more marketable. If you are not willing to learn, I must say, don’t bother: the days when anyone could easily land a career in IT are gone. Today, the market is highly competitive.
One topic that popular courses rarely cover in enough depth is random forests: pruning the underlying trees, choosing the number of trees and the number of features considered at each split, and so on. Courses usually explain how random forests work and how to call an implementation, but they skip these details and more advanced ensembling methods, which are crucial for building effective models and optimizing performance.
Another often overlooked skill is web scraping. Data collection from the web is a common task in many ML projects, but it requires knowledge of tools and techniques for extracting data from websites. Popular courses may briefly touch on this topic without providing enough hands-on practice to master it.
In addition to technical skills, junior ML developers need to know how to present their solutions effectively. This includes creating user-friendly interfaces and deploying models to production environments. For example, Flask together with ngrok is a quick way to put a web interface in front of an ML model and make it reachable from outside your machine, but many courses do not cover these tools at all.
Docker is also frequently overlooked. It is a containerization tool that lets developers package and deploy applications easily, which is valuable for deploying ML models to production and for scaling applications. Virtual environments matter as well, for managing dependencies and isolating projects. Many courses mention them briefly but do not give junior developers enough practice to understand why they are important.
GitHub, together with Git, the version control system underneath it, is essential for version control and collaboration in software development, including ML projects. However, many junior developers have only a surface-level understanding of it and do not know how to use it effectively for managing ML projects.
Finally, ML tracking systems such as Weights & Biases or MLflow help developers keep track of model performance and experiment results. These systems are valuable for optimizing models, but many courses do not cover them in depth. By mastering these skills, junior developers can distinguish themselves from the competition and become valuable assets to any ML team.
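To make the web scraping skill mentioned above a little more concrete, here is a minimal sketch of the two steps such a task usually involves: extracting structured rows from HTML and storing them in a database. To stay dependency-free it uses only the Python standard library (`html.parser` and `sqlite3`) as a stand-in for dedicated scraping libraries. The HTML snippet, the `prices` table, and the names `TableRowParser`, `scrape_rows`, and `store_rows` are all hypothetical and exist only for this illustration; a real project would download the page over HTTP and respect the site's terms of use.

```python
import sqlite3
from html.parser import HTMLParser

# Hypothetical page content; a real project would download this over HTTP.
SAMPLE_HTML = """
<table id="prices">
  <tr><th>Item</th><th>Price</th></tr>
  <tr><td>Keyboard</td><td>49.90</td></tr>
  <tr><td>Mouse</td><td>19.50</td></tr>
</table>
"""


class TableRowParser(HTMLParser):
    """Collects the text of every <td> cell, grouped by table row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._current_row = None
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._current_row = []
        elif tag == "td":
            self._in_cell = True

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_cell = False
        elif tag == "tr":
            # Header rows contain only <th> cells, so they stay empty and are skipped.
            if self._current_row:
                self.rows.append(tuple(self._current_row))
            self._current_row = None

    def handle_data(self, data):
        # Keep text only while inside a <td> that belongs to an open row.
        if self._in_cell and self._current_row is not None:
            self._current_row.append(data.strip())


def scrape_rows(html):
    """Return the data rows of the table as tuples of strings."""
    parser = TableRowParser()
    parser.feed(html)
    return parser.rows


def store_rows(conn, rows):
    """Insert (item, price) rows into a SQLite table, converting prices to floats."""
    conn.execute("CREATE TABLE IF NOT EXISTS prices (item TEXT, price REAL)")
    conn.executemany(
        "INSERT INTO prices (item, price) VALUES (?, ?)",
        [(item, float(price)) for item, price in rows],
    )
    conn.commit()


rows = scrape_rows(SAMPLE_HTML)
conn = sqlite3.connect(":memory:")  # swap for a file path to persist the data
store_rows(conn, rows)
stored = conn.execute("SELECT item, price FROM prices ORDER BY item").fetchall()
print(stored)
```

Keeping parsing and storage in separate functions makes it easy to replace the parser with a dedicated scraping library later without touching the database code.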
What do you need to get an ML engineering job?
Young professionals often face a dilemma: to get a job, they need experience. But how can they gain experience if no one wants to hire them? Fortunately, in ML and programming in general, this problem can be solved by creating pet projects. These projects allow you to showcase your programming skills, ML knowledge, and motivation to potential employers. Here are some ideas for pet projects that I, honestly, would like to see more often among job applicants for my department:
- Web scraping project: The goal of this project is to scrape data from a specific website and store it in a database. The data can be used for various purposes, such as analysis or machine learning. The project can involve using libraries like BeautifulSoup or Scrapy for web scraping and SQLite or MySQL for database storage. Additionally, integration with Google Drive or other cloud services for backup and easy access to the data can be included.
- NLP project: Build a chatbot that can understand and respond to natural language queries. The chatbot can be integrated with additional functionality, such as maps integration, to provide more useful responses. Libraries like NLTK or spaCy can be used for natural language processing, and TensorFlow or PyTorch can be used for building the model.
- CV project: Build a computer vision model that can detect objects in images. It is not necessary to use the most sophisticated models; demonstrating that you can work with a well-known deep learning architecture such as YOLO for detection (or U-Net, if you frame the task as segmentation) is sufficient. The project can include a small website, exposed with ngrok or a similar tool, where a user uploads an image and gets it back with the detected objects marked by bounding boxes.
- Sound project: Build a speech-to-text model that can convert recorded audio into text. Train the model using recurrent deep learning architectures like LSTM or GRU. Libraries like PyDub or librosa can be used for audio processing, and TensorFlow or PyTorch can be used for building the model.
- Time series prediction project: Build a model that can predict future values based on past data. Libraries like Pandas or NumPy can be used for data manipulation, and scikit-learn or TensorFlow can be used for building the model. The data can be sourced from various places, such as stock market data or weather data, and can be integrated with web scraping tools to automate data collection.
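For the time series idea in the list above, a useful first step is to measure how well trivial forecasts perform, so that any trained model has a number to beat. The sketch below is only an illustration under stated assumptions: the `series` values are made up, the function names are hypothetical, and it uses plain Python rather than Pandas or scikit-learn. It compares a naive forecast (repeat the last value) with a moving average, using walk-forward evaluation so that each prediction sees only past data.

```python
from statistics import mean

# Made-up daily values used only for illustration.
series = [10.0, 12.0, 11.0, 13.0, 14.0, 13.0, 15.0, 16.0]


def naive_forecast(history):
    """Predict that the next value equals the last observed one."""
    return history[-1]


def moving_average_forecast(history, window=3):
    """Predict the mean of the last `window` observations."""
    return mean(history[-window:])


def walk_forward_mae(values, forecast, start):
    """Mean absolute error when each point from `start` on is forecast
    using only the points that came before it."""
    errors = []
    for t in range(start, len(values)):
        prediction = forecast(values[:t])
        errors.append(abs(values[t] - prediction))
    return mean(errors)


naive_mae = walk_forward_mae(series, naive_forecast, start=3)
ma_mae = walk_forward_mae(series, moving_average_forecast, start=3)
print(f"naive MAE: {naive_mae:.3f}, moving-average MAE: {ma_mae:.3f}")
```

A portfolio project that reports such baselines next to the model's error shows that the evaluation avoids looking into the future, which is a common mistake in first time series projects.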
What else?
Having a strong portfolio that showcases your skills is as valuable as, if not more valuable than, a degree from a renowned university. However, there is another set of skills that matters for anyone in today’s world: soft skills. Developing soft skills is important for ML engineers because it helps them communicate complex technical concepts to non-technical stakeholders, collaborate effectively with team members, and build strong relationships with clients and customers. Some ways to develop soft skills include:
- Creating a blog: Writing about technical concepts in a clear and concise manner can help you become better at communication and structuring your thoughts. It also helps you grasp how to explain complex tasks to different audiences.
- Speaking at conferences and meetups: Presenting at conferences can improve your public speaking skills and teach you how to tailor your message to different audiences.
- Practicing with non-technical audiences: Explaining technical concepts in simple terms to people outside the field enhances your ability to communicate with a wide range of people.