tutoriaLinux – system administration education for the huddled masses

How to tail (follow) Linux Logs

June 13, 2021/in Tutorials, Videos/by Dave Cohen

If you’re wondering how to follow Linux logs for a process or systemd unit (service), here are the commands you want:

Traditional, File-based logs

For a traditional service (a long-running process that logs to one or more files), you can use traditional Linux tools to check the logs. A common command might be:

tail -f /var/log/$YOURLOGFILE

Replace $YOURLOGFILE with the actual name of the logfile. The path may be different on your system; it depends on what your process/service is configured to use.

The ‘tail’ command will start at the end of the logfile, and the ‘-f’ option will follow the log as new lines are written to it. You’re basically getting a ‘live’ view of the log as it grows.
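If you want to see how the pieces fit together without touching a real log, here is a minimal sketch. The temporary file and its three lines are made up for the demo; on a real system you would point tail at your own logfile.

```shell
# Hypothetical demo log created just for this example; substitute your real logfile path.
logfile="$(mktemp)"
printf 'line 1\nline 2\nline 3\n' > "$logfile"

# -n 2 prints only the last 2 lines (tail prints the last 10 by default).
last_two="$(tail -n 2 "$logfile")"
echo "$last_two"

# To follow a real log live, add -f. The capital -F variant also re-opens the
# file by name, which helps when the log gets rotated:
#   tail -n 50 -F /var/log/$YOURLOGFILE

rm -f "$logfile"
```

Combining -n with -f is handy because you get a bit of recent context before the live stream starts.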

For a systemd service (systemd unit)

Systemd uses journald (the systemd log journal) to create binary logs (non-plaintext logfiles, i.e. the text would look garbled if you looked at it with traditional Unix/Linux tools like less, tail, cat, etc.).

To view logs through journald, you’ll want these commands:

journalctl -u $YOURSERVICE

Calling journalctl with -u and a service (systemd unit) name will get you all logs for that service, starting from the beginning. You can page through with the space bar and quit with the ‘q’ key. Prepare to do a lot of scrolling. A more practical command is:

journalctl -fu $YOURSERVICE

The -u option selects the unit, and -f tells journalctl to do the same thing as “tail -f”: it starts at the end of the logs (the most current ones) and follows new logs ‘live’ as they stream in.
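A few more journalctl options are worth knowing when you don't need the whole history. This is a rough sketch, not the only way to do it: the unit name nginx.service and the two helper function names are placeholders I made up, so swap in your own service.

```shell
# Hypothetical unit name; substitute your own service.
unit="nginx.service"

# Last 20 entries for a unit, printed straight to the terminal (no pager).
recent_logs() {
    journalctl -u "$1" -n 20 --no-pager
}

# Only entries from the past hour at priority "err" or worse.
recent_errors() {
    journalctl -u "$1" --since "1 hour ago" -p err --no-pager
}

if command -v journalctl >/dev/null 2>&1; then
    recent_logs "$unit" || echo "could not read the journal for $unit"
else
    echo "journalctl not found; is this a systemd host?"
fi

# To follow live, starting from the last 20 entries:
#   journalctl -u "$unit" -n 20 -f
```

Filtering by time and priority first usually beats scrolling through everything from the beginning.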

I made a video on this (it also contains a bit more context and an extra trick or two); enjoy!

What Are “DevOps” Skills?

July 29, 2019/in Business, Job Hunt/by Dave Cohen

Someone recently asked me about DevOps ‘courses’ which got me thinking about what the required skillsets are for getting into roles that include the word “DevOps” in the title or description.

I’ve been working in various “DevOpsy” roles professionally for about 7 years now and I still don’t know exactly what it means. Each company seems to have a different definition, and the philosophy that it all started from is now only a distant memory :-D.

That said, every “DevOps” role I’ve had has drawn on some combination of the following skillsets:

Linux Skills
Networking and Web/HTTP skills. And DNS. Always DNS. And caching, oh god the caching problems.
Some cloud provider’s stack (AWS, GCP, Azure, whatever): knowing how different architectures are implemented, and using the tools they expose to infrastructure designers and operators.
Familiarity with a CI/CD process (specific tools are usually not important in interviews, as long as you’re comfortable with ONE of them).
Generalized troubleshooting and problem solving skills. Almost every problem you face as a DevOps person will be 15% known, 85% unknown. The ability to quickly learn about the problem domain and start troubleshooting is invaluable.
Be comfortable with the software development process — how software gets written and deployed. Know the basics of software tooling — git, the basics of the language your devs are writing in, debugging tools for that language/environment, etc.
Be *really* comfortable with reading through (and puzzling over) large codebases.

It *really* helps to have some programming (developing software with a team of other devs) experience, although it’s not a hard requirement.

I’m trying to stay away from specific tools recommendations in this post, but several important ones come to mind.

My question to you: what skills do you find yourself using at your DevOpsified job?

How to Install Ansible on Ubuntu Linux

December 30, 2017/in Today I Learned, Tutorials/by Dave Cohen

The official Ansible installation instructions are WRONG, and will result in a bunch of errors and wasted time troubleshooting. Here’s how to install the newest version of Ansible on Ubuntu or Debian (or any other Linux distro, provided you swap out apt-get for your own package management commands).

Note: This installs the NEWEST ‘stable’ version of Ansible, even if Ubuntu’s package repositories are outdated (which they usually are):

sudo -i
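apt-get install python-pip   # the apt package is python-pip on Python 2.7 systems like this one; newer releases call it python3-pip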
pip install --upgrade pip
pip install ansible jinja2 pyaml

Troubleshooting Failed Installs Because You Naively Followed the Official Instructions

Here are the official installation docs, via https://docs.ansible.com/ansible/latest/intro_installation.html#latest-releases-via-pip

# Not really all you need to do
root@jenkins2:~# pip install ansible

First Error: Ansible doesn’t automatically pull in the required jinja2 module/package

Here’s what this looks like, after you run ‘pip install ansible’ as above:

jenkins@jenkins2:~$ ansible --version
Traceback (most recent call last):
 File "/usr/local/bin/ansible", line 40, in <module>
 import ansible.constants as C
 File "/usr/local/lib/python2.7/dist-packages/ansible/constants.py", line 12, in <module>
 from jinja2 import Template
ImportError: No module named jinja2

You get this error because ansible requires the jinja2 module, which isn’t marked as a required package in pip for some reason.

Solution: Install the missing jinja2 module via the jinja2 package in Pip

sudo pip install jinja2

Second Error: Pip doesn’t automatically pull in a yaml package for Ansible

This is the case if you’re getting an error like the following one:

jenkins@jenkins2:~$ ansible --version
Traceback (most recent call last):
 File "/usr/local/bin/ansible", line 40, in <module>
 import ansible.constants as C
 File "/usr/local/lib/python2.7/dist-packages/ansible/constants.py", line 18, in <module>
 from ansible.config.manager import ConfigManager, ensure_type, get_ini_config_value
 File "/usr/local/lib/python2.7/dist-packages/ansible/config/manager.py", line 11, in <module>
 import yaml

Solution: Install the yaml module via the pyaml package in Pip

sudo pip install pyaml

This is a stupidly named package in Pip, and it’s easy to get the name wrong, as below. The ‘yaml’ module itself comes from the PyYAML package, which ‘pyaml’ pulls in as a dependency, so installing ‘pyaml’ gets you what Ansible needs.

# ERROR: Install pyaml (it's not called yaml)
root@jenkins2:~# pip install yaml
Collecting yaml
 Could not find a version that satisfies the requirement yaml (from versions: )
No matching distribution found for yaml

So if you followed the instructions at the beginning of this post, OR you had to troubleshoot and start in the middle of this post somewhere, you should now have a functioning ansible install. If running ‘ansible --version’ doesn’t print out a version number, it should at least point you to the next error you need to troubleshoot.
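If you'd rather check for both missing modules up front instead of reading tracebacks one at a time, something like the following sketch should work. It assumes your interpreter is on the PATH as python or python3; if pip installed into a different interpreter, point the py variable at that one instead.

```shell
# Use whichever interpreter is on PATH (assumption: it is the one pip installed into).
py="$(command -v python || command -v python3 || true)"

# Try importing each module Ansible tripped over above and collect the results.
report=""
for module in jinja2 yaml; do
    if [ -n "$py" ] && "$py" -c "import $module" 2>/dev/null; then
        report="$report$module: ok
"
    else
        report="$report$module: MISSING
"
    fi
done
printf '%s' "$report"
```

Anything reported as MISSING is a module that still has to be installed via pip before ‘ansible --version’ will run cleanly.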

Here’s an example from a Jenkins build server:

jenkins@jenkins-server:~$ ansible --version
ansible 2.4.2.0
 config file = /etc/ansible/ansible.cfg
 configured module search path = [u'/var/lib/jenkins/.ansible/plugins/modules', u'/usr/share/ansible/plugins/modules']
 ansible python module location = /usr/local/lib/python2.7/dist-packages/ansible
 executable location = /usr/local/bin/ansible
 python version = 2.7.12 (default, Nov 20 2017, 18:23:56) [GCC 5.4.0 20160609]

I hope that helps to guide you through the frustrating path of bad package naming, outdated repositories, and broken tooling that is (seemingly always) the current state of affairs in Ansible-land!

By the way, if you want to learn more about Linux, Programming, and other tech stuff, I’ve got a bunch of free Linux video tutorials up on my YouTube Channel: https://www.youtube.com/c/tutorialinux

Check it out!

My Favorite Computer Games

November 24, 2017/in Fun and Games/by Dave Cohen

I’ve played a lot of computer games over the years. They’re what got me into computers in the ’90s, and several of them are responsible for me getting into Linux, programming, and system administration.

Just FYI: these contain affiliate links to the humble bundle store.

Deus Ex (the original 1999 masterpiece)

Easily my favorite game of all time. Got me interested in hacking and computer systems.

Buy it on Steam (usually around $6)

Deus Ex: Mankind Divided

A worthy follow-up to the original. The first one of the sequels to recreate a lot of the atmosphere that I loved in Deus Ex 1999. Buy Deus Ex: Mankind Divided on Steam.

More coming…

Thoughts Worth Thinking About

September 22, 2017/in Programming, Today I Learned/by Dave Cohen

I was just chatting with a friend about software, infrastructure, modern architectures and methodologies, and just sysadmin stuff in general. After getting up for a few minutes to grab a drink, here’s what I found waiting for me in the chat room. I’m sharing it, with his permission:

I find it funny. Everyone is like, “my servers are cattle, not pets”, “no snowflakes” and “use Docker,” with the result that the infrastructure and host systems are the worst form of this. xD

But to be honest, I am realizing that the cattle/pet thing is wrong. I want my server to be a shepherd dog and my processes/services to be cattle. I want my dog to be smart and take care of things and be reliable, exactly because my cattle isn’t.

I want it to do the right thing at the right moment and in the right way. I want to improve my servers every time something goes wrong and not have them be as dumb as the first time it happened.

And the snowflake. No, I don’t want my system to be like every other. In that case I would just make a distro that is like that and just install it. No architecture/design of system/infrastructure required.

It’s like.. just using WordPress instead of starting your own thing. “But that way it’s easier, and you don’t have your snowflake web app.”

Oh and you can google stack traces.

Why program at all?

It’s also a bit like when people thought it was a good idea to use Microsoft Office (or early versions of Frontpage) to make web pages. xD

The WP comparison probably also works with OSs. “Why not just use Ubuntu” is like “why not just use WP?” Big community, widely used language.

It’s worth considering. Thinking carefully about fads and hype, before going along with them, is a good thing. And even if a lot of these technologies and approaches have merit, they are not a ‘one size fits all’ answer. If something doesn’t make sense to you, always ask more questions, do more research, and try things out yourself. That’s what makes us better than machines :-).

I was wrong about Docker

May 22, 2017/in Business, Job Hunt, Today I Learned, Tools/by Dave Cohen

I recently found an answer that I gave someone to a Facebook chat question in ~2016-2017, and was amazed at how “technically right” I was (i.e. totally wrong for the career question that I was answering).

I just wanted to post this to show that I am often too technologically focused, and that this can often get in the way of getting hired and getting things done the way everyone else is doing them. On the flipside, my heart is in the right place and this approach certainly wouldn’t have HURT the person I was giving advice to.

Here it is!

Yeah, I should make some docker vids! So, LXC is the original Linux-native implementation of containers. There was an implementation before, called OpenVZ, but that relied on a patched Linux kernel and I don’t think it’s used much anymore.

The mainline kernel itself exposes the “ingredients” for containers (user/process namespacing, cgroups, etc.) and LXC was one of the original implementations of ‘containers’ using these native building blocks.

Docker was originally a wrapper around LXC, but has now actually created their own implementation of the container back-end (using the same exposed ‘ingredients’ from the Linux kernel that LXC uses).

LXC (and now LXD) is under-hyped, solid, and a bit more intuitive for learning (more like FreeBSD Jails, less hype/magic than Docker). I think it’s good to learn LXC or FreeBSD jails before jumping into Docker.

All the concepts from LXC are totally applicable to Docker as well.

Docker, as a software product, layers some extra tools on top of this basic ‘containerization’ tech: they have their docker ‘hubs’ — pre-built application containers. They have extra networking stuff on top of what you’d get with LXC. They have something that looks like very basic container linking/service discovery. They have enormous amounts of hype, which can help you get a high-paying job (seriously. It’s worth learning just for that.)

I’m torn, because I’ve used Docker since before it was an open-source product (it used to be a paid thing called DotCloud), and have hated it for almost the entire time. It’s a complex, unstable, and questionably architected piece of software. It’s also incredibly overhyped and misused.

BUT: Docker has some great ideas in it, and is absolutely worth learning for your career. As a technology, it leaves a lot to be desired.

I’d love to hear what you think of Docker after getting comfortable with LXC. LXC will teach you the concepts that underlie all of this containerization stuff, and Docker adds some new features (and new headaches) onto that.

Let me know what you think, if/when you start your Docker journey!

The Hardest (and most fun) Problems to Troubleshoot

April 30, 2017/in Today I Learned, Uncategorized/by Dave Cohen

I recently wrote a FAQ-style post about System Administration and technology careers in general. One of the best questions I was asked was about what kinds of really interesting troubleshooting problems I’ve had to deal with. Here’s that question, along with my answer:

What’s one of the most interesting things you’ve had to troubleshoot / do while maintaining a system?

I’m leaving out specific examples because they’re a mixture of non-public information and hyperspecific (uninteresting) technical stuff, but I can give some outlines for what generally makes for interesting problems to solve.

The really interesting problems I’ve seen tend to be related to performance, networking, and distributed systems. Usually they require a combination of different knowledge to solve:

Systems/OS: What is the operating system doing when everything slows down? What’s causing it to do that?
Networking/Distributed Systems: What’s actually happening when these machines communicate? How are they supposed to share and manage state, deal with network partitions, and ensure high availability? What are they *actually* doing when this problem happens?
Software Development: Which part of the code is causing this network/OS issue, and which code path leads there? Can I actually look at and modify this code? Is this code written by our developers, or an open-source project? What can I do to confirm the issue and test a fix? Can I contribute a fix back to the upstream project?

System Administration Careers: Frequently Asked Questions

April 30, 2017/in Job Hunt/by Dave Cohen

If you’re wondering whether or not a System Administration career is right for you, this article might help. Someone just sent me some questions about what a career in System Administration is like, and asked some questions that I hadn’t thought to answer on YouTube or here on the tutorialinux blog. Since I’ve been working in various IT Operations and Software Development disciplines professionally for the last 7 years, I love talking about this stuff. Here are my two cents:

What are the Responsibilities of a Linux / Unix System Administrator?

Architecting, Building, and Troubleshooting/Maintaining the technical infrastructure for a company or software product. At a high level, you’re responsible for figuring out what the technical requirements are and using existing software/products/tools to provide those requirements (or occasionally writing your own). Early-career roles focus on implementing predefined task-chunks:

Figure out why this server is doing XYZ
Fix a problem with our software deployment automation code
Increase storage size for one of the database clusters
Replace failed disks at a datacenter
Create a web server configuration file for a new project that Dev is working on

Senior-level positions often have more design/architecture:

Deep introspection of the OS and the software product you’re supporting
Design a new deployment pipeline
Build a technical team for a new project
Evaluate new software that would change how we run our infrastructure (e.g. containers + scheduler vs. VMs + config management)
Troubleshoot tough problems (OS-level issues, bugs arising from a confluence of several edge cases across the stack, etc.)

What education/experience/credentials are required to do this job?

How to Record Your Work on the Command Line with the script(1) Command

March 13, 2017/in Today I Learned, Tools/by Christian

Most Unix-like operating systems feature a script command. You can find its manual in script(1) (type man 1 script to access it). script records a transcript (“typescript,” not to be confused with the language TypeScript) of your current session in the command line.

The script command can be used as a way to log what you are doing in a shell session. It’s often used during troubleshooting, documentation, PCI compliance audits, security/remediation work, penetration tests, and other situations where it’s useful to record a play-by-play log of what you’re doing on the machine.

Practical Demonstration

Amazon AWS Basics Tutorial: Setting Up a Load-Balanced, Auto-Scaling Webserver

February 4, 2017/in Tutorials, Videos/by Dave Cohen

This step-by-step tutorial will show you how to build load-balanced, highly available, self-healing infrastructure on the Amazon Web Services (AWS) Cloud.

If you’ve been wondering how to get started with “DevOps,” “Cloud” System Administration, and public cloud providers in general, this is the series for you.

Amazon’s Cloud Services are vast. There is an enormous number of things you can use:

services that replicate traditional Virtual Machines in a datacenter (EC2, RDS, Elasticache)
services that give you insane amounts of availability and durability, along with an innovative API (S3 for storage)
replacements for other solutions in brand new industries (ECS)
plenty of unique services that no one else is offering

If you really want to learn this stuff, you want to start with simple, practical examples that slowly walk you through the huge number of new concepts that you’ll need to learn. You need something practical that you can use in real life, so that navigating Amazon’s immensely complex graphical user interface becomes second-nature.

Once you’ve got a grasp on the basic concepts and the most important services, we can dive into building more realistic, more complex infrastructure that uses specific AWS features to get the job done.

We’ll also explore the AWS API, which allows you to use a command-line client or bindings for your favorite programming language to work with infrastructure. This is how professionals usually prefer to interact with AWS.

Video #1: Intro and EC2 Instance Creation

In the first video, you’ll learn the basics of working in the Amazon AWS Console (their graphical user interface) and set up an EC2 instance with a webserver on it. You’ll create your first security group (AWS network firewall rules), set up SSH keys, and connect to your ‘development’ instance for the first time.

Once you’re ready for launch, you’ll learn how to create a new Amazon Machine Image (AMI) from your EC2 instance, which we can use to spin up clones of that instance later on.

Video #2: Auto-Scaling Groups and Launch Configurations

In the second video, I’ll show you how to set up an Auto-Scaling Group (ASG) which uses a Launch Configuration to spin up new EC2 instances (clones of your development instance). We’ll talk a bit about scaling and how you’d set it up in a real project.

Video #3: Tying it together with an Elastic Load Balancer

In the third video, you’ll be setting up an Amazon Elastic Load Balancer (ELB) to balance traffic across the instances in your Auto-Scaling Group. I’ll show you how to add health checks and hook up the Application Load Balancer to your ASG, so it automatically balances across healthy instances in your ASG, even as they’re added and removed.

There’s more coming, so stay tuned!