CLI text processing with GNU awk
About
When it comes to command line text processing, the three major pillars are grep for filtering, sed for substitution and awk for field processing. This book will dive deep into field processing, show examples for filtering features, multiple file processing, how to construct solutions that depend on multiple records, how to compare records and fields between two or more files, how to identify duplicates while maintaining input order and so on.
This book heavily leans on examples to present features one by one. Regular Expressions will also be discussed in detail.
Exercises are also included to test your understanding.
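As a rough illustration of the grep, sed and awk roles described above, here is a minimal sketch. It is not taken from the book; the sample data and the variable name `input` are made up for this example.

```shell
# Sample input invented for this illustration (not from the book)
input='apple 3
banana 7
apple 3
cherry 5'

# grep: filtering, keep only the lines that match a pattern
printf '%s\n' "$input" | grep 'an'

# sed: substitution, replace the first 'apple' on each line
printf '%s\n' "$input" | sed 's/apple/mango/'

# awk: field processing, print field 1 when field 2 is greater than 4
printf '%s\n' "$input" | awk '$2 > 4 {print $1}'

# awk: drop duplicate lines while keeping the input order
# (seen[$0]++ is 0, i.e. false, only the first time a line appears)
printf '%s\n' "$input" | awk '!seen[$0]++'
```

The last command relies on awk's associative arrays: a line is printed only on its first occurrence, so the output order matches the input order.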
Prerequisites
You should be familiar with command line usage in a Unix-like environment. You should also be comfortable with concepts like file redirection and command pipelines. Knowing the basics of the grep and sed commands will be handy in understanding the filtering and substitution features of awk.
As awk is a programming language, you are also expected to be familiar with concepts like variables, printing, functions, control structures, arrays and so on.
You are also expected to get comfortable with reading manuals, searching online, visiting external links provided for further reading, tinkering with illustrated examples, asking for help when you are stuck and so on. In other words, be proactive and curious instead of just consuming the content passively.
If you are new to the world of the command line, check out my Computing from the Command Line ebook and curated resources on Linux CLI and Shell scripting before starting this book.
Testimonials
Step up your cli fu with this fabulous intro & deep dive into awk. I learned a ton of tricks! — feedback on twitter
I consider myself pretty experienced at shell-fu and capable of doing most things I set out to achieve in either bash scripts or fearless one-liners. However, my awk is rudimentary at best, I think mostly because it's such an unforgiving environment to experiment in. These books you've written are great for a bit of first principles insight and then quickly building up to functional usage. I will have no hesitation in referring colleagues to them! — feedback on Hacker News
Sample chapters
For a preview of the book, see sample chapters on GitHub.
GitHub repo
Visit https://github.com/learnbyexample/learn_gnuawk for markdown source, example files, exercise solutions and other details related to the book.
Interactive exercises
Based on the book contents as well as the exercises, I made an interactive TUI app with 80+ questions. Reference solutions are also provided.
Chapters
- Preface
- Installation and Documentation
- awk introduction
- Regular Expressions
- Field separators
- Record separators
- In-place file editing
- Using shell variables
- Control Structures
- Built-in functions
- Multiple file input
- Processing multiple records
- Two file processing
- Dealing with duplicates
- awk scripts
- Gotchas and Tips
- Further Reading
Feedback and Errata
I would highly appreciate it if you'd let me know how you felt about this ebook. It could be anything from a simple thank you, Gumroad rating, pointing out a typo, mistakes in code snippets, which aspects of the book worked for you (or didn't!) and so on. Reader feedback is essential and especially so for self-published authors.
You can reach me via:
- Issue Manager: https://github.com/learnbyexample/learn_gnuawk/issues
- E-mail: learnbyexample.net@gmail.com
- Twitter: https://twitter.com/learn_byexample
You'll get PDF and EPUB versions of the book.