As a professional who works with data, I understand the importance of being efficient and accurate in the workplace. That’s why I believe mastering the command line is an essential skill for streamlining data analysis tasks and improving productivity. It’s equally important for regular users who want to optimize their operating system usage and automate various tasks.
In this blog, we will review a popular (144k ?) one-page guide available on GitHub. The guide is designed to equip you with essential command-line skills that can enhance your workflow.
The Command Line (CLI), also known as the terminal or console, is a text-based interface that allows users to interact with a computer’s operating system through the use of typed commands. It offers an alternative to graphical user interfaces (GUIs) and provides a more direct and precise way to access and manipulate files, directories, and system resources.
Screenshot by Author
Users can enter commands in a terminal that allows users to perform tasks with precision and automation, such as scripting, software development, data processing, and system administration. The terminal enables users to execute multiple complex operations with just one command.
Mastering the art of the command line is a journey that can significantly enhance your productivity and understanding of your computer system. Whether you’re a beginner or an experienced user, the command line offers a powerful way to navigate, customize, and automate tasks on your computer.
It is particularly beneficial for data scientists. Through the command line, data professionals can streamline data cleaning, execute data pipelines, automate data-related tasks, and use various command line tools for testing and model development.
Screenshot from jlevy/the-art-of-command-line
This guide aims to provide essential command-line knowledge in one page, with a focus on Linux but also including tools for macOS and Windows users. It covers basic commands, processing files and data, system debugging, and commands that are only available on Mac and Windows. The guide is available in multiple languages, thanks to the contributions of various authors and translators.
Languages: Čeština ∙ Deutsch ∙ Ελληνικά ∙ English ∙ Español ∙ Français ∙ Indonesia ∙ Italiano ∙ 日本語 ∙ 한국어 ∙ polski ∙ Português ∙ Română ∙ Русский ∙ Slovenščina ∙ Українська ∙ 简体中文 ∙ 繁體中文
The scope of this guide is broad yet concise, aiming to cover everything important, provide specific examples, and avoid unnecessary details. It’s designed for interactive Bash use, but many tips apply to other shells and Bash scripting as well.
Basics
It’s essential to learn basic Bash commands and understand their documentation `man <command>` and a master at least one text-based editor (e.g. Vim, Emacs, nano) for efficient terminal-based editing. Additionally, it’s important to learn about file and output manipulation, including redirection (>, <, |), and file globbing.
Everyday Use
For efficient command completion and history, use Tab and Ctrl-R, respectively. To navigate and manage files, understand directory navigation using ls, cd , ln, chmod, and chown.
Processing Files and Data
Learn to use text processing tools: grep, awk, sed, cut, sort, uniq, and wc. For file searching, learn to use find and locate to locate files and directories.
System Debugging
Get familiar with system monitoring and debugging tools such as top, ps, netstat, dmesg, and iotop. Use strace, ltrace, and system logs for performance analysis and issue diagnosis.
One-liners
One-liners are powerful command sequences that perform complex tasks quickly. Examples include sorting and counting occurrences in text files, batch renaming, and system monitoring.
Batch renaming script for changing .txt to .md for all files in a directory:
Obscure but Useful
Specialized commands like expr, cal, yes, env, and printenv offer useful functionalities for specific scenarios.
macOS Only
Mac users have access to unique tools like Homebrew for package management, pbcopy and pbpaste for clipboard interaction, and specific file and system utilities (mdfind, mdls).
Windows Only
Windows users can turn to Cygwin, Windows Subsystem for Linux (WSL), or MinGW for Unix-like command-line environments. Tools like wmic, ipconfig, and PowerShell scripts extend command-line capabilities on Windows.
Playful Commands
By using tools like curl, egrep, tr, and cowsay, you can fetch, process, and display information creatively, showcasing the power and flexibility at your fingertips.
This guide is a useful cheat sheet for learning about new CLI tools and their applications in various scenarios. It is actively maintained, and you can even contribute to the project by creating a pull request. The Master The Art Of Command Line guide is by the community and for the community, so if you find any mistakes or learn something new that is missing, please update the main README.md file.
I hope you learn about new tools and utilities from this guide and apply them to your projects. In my experience, I have used more command-line tools than actual Python code for data projects, especially if you are a data engineer or MLOps engineer.
Further Read
Abid Ali Awan (@1abidaliawan) is a certified data scientist professional who loves building machine learning models. Currently, he is focusing on content creation and writing technical blogs on machine learning and data science technologies. Abid holds a Master’s degree in Technology Management and a bachelor’s degree in Telecommunication Engineering. His vision is to build an AI product using a graph neural network for students struggling with mental illness.