
Linux: Troubleshooting Hardware, Storage, and Linux OS

Troubleshooting steps:

  1. Identify the problem
  2. Establish a theory of probable cause
  3. Test the theory to confirm or refute it
  4. Establish a plan of action, implement the solution or escalate if needed, and then verify full system functionality
  5. Implement preventive measures to avoid recurrence and perform a root cause analysis

Boot Issues

Server Not Turning On

  • No power lights?
  • No fans?
  • No console output?
  • Do similar systems have the same issues?
  • Maybe the PDU is down?
  • Maybe the PSU has failed?
  • Check the power in the PDU
  • Swap in a known-good power cable
  • Plug another device into the same outlet
  • Still failing?
  • Inspect the PSU
  • Reseat connectors
  • Swap in a spare PSU
  • Verify the system powers on
  • Label cables
  • Schedule PSU health checks
  • Perform a root cause analysis

GRUB Misconfigurations

  • The server drops to a GRUB rescue prompt?
  • The server shows an error like "file not found"
  • Are multiple kernels failing?
  • Maybe /etc/default/grub was edited?
  • Maybe an initrd entry was deleted?
  • Use the GRUB CLI to probe available partitions (see the sketch after this list)
  • Verify the kernel and initramfs files are where GRUB expects them to be
  • Boot from rescue ISO or live environment
  • Mount the root filesystem
  • Correct the UUID or kernel path in /etc/default/grub
  • Regenerate GRUB configuration: grub2-mkconfig -o /boot/grub2/grub.cfg on RHEL-based systems, or update-grub on Debian
  • Reboot and verify the kernel loads properly
  • Back up grub.cfg before modifications
  • Why did the issue occur in the first place?
  • A rushed update?
  • A lack of peer review?
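A minimal sketch of that probing from the grub rescue> prompt; the device and path names are assumptions and vary by system (Debian-based systems use /boot/grub instead of /boot/grub2):

# at the grub rescue> prompt
ls                        # list partitions, e.g. (hd0,msdos1)
ls (hd0,msdos1)/boot      # look for vmlinuz-* and initramfs-* files
set root=(hd0,msdos1)
set prefix=(hd0,msdos1)/boot/grub2
insmod normal
normal                    # return to the normal GRUB menu and boot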

Kernel Corruption Issues

  • Observing errors such as "bad magic number" or "kernel image corrupt" during boot
  • Check whether only the latest kernel version is affected or if other versions work from the GRUB menu
  • Maybe a package update failed mid-install
  • Maybe the /boot partition has disk errors
  • Boot into an older, working kernel
  • Mount /boot and verify file checksums; if checksums fail, the corruption is real (see the sketch after this list)
  • Reinstall the corrupted kernel package
  • Reboot to verify that the new kernel loads
  • Monitor disk health
  • Ensure updates are completed successfully
  • See if disk failure or an interrupted update was at fault
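A hedged sketch of the verify-and-reinstall steps on a RHEL-based system; the package name and device are examples:

# after booting an older kernel from the GRUB menu
rpm -V kernel-core            # verify installed kernel files against the RPM database
dnf reinstall kernel-core     # reinstall the affected kernel package
smartctl -H /dev/sda          # check disk health (device is an example)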

Missing or Disabled Drivers

  • Boot hangs or drops to an initramfs shell with errors like "VFS: Cannot open root device"
  • Check if only certain hardware (for example, a RAID controller) is missing in /dev or /sys
  • Maybe the initramfs was rebuilt without the necessary driver modules
  • Maybe someone blacklisted a driver
  • Examine the initramfs contents with lsinitrd or dracut --list-modules to confirm the driver is absent (see the sketch after this list)
  • Rebuild the initramfs including the required modules
  • Reboot to verify that the driver loads and the root filesystem is detected
  • Document driver dependencies in the build scripts
  • Automate initramfs rebuilds when kernel updates occur
  • Did a kernel package change or a manual configuration error cause the driver omission?
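A sketch of the inspect-and-rebuild steps using dracut; the module name megaraid_sas is an example:

# confirm whether the driver is inside the current initramfs
lsinitrd /boot/initramfs-$(uname -r).img | grep megaraid

# rebuild the initramfs with the required driver included
dracut --force --add-drivers megaraid_sas /boot/initramfs-$(uname -r).img $(uname -r)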

Kernel Panic Events

  • Read the panic message on the console
  • Does the panic happen on every boot or only after certain changes?
  • Maybe a newly added module is incompatible
  • Maybe the memory has gone bad
  • Let's try booting with a previous kernel
  • Run memtest86+
  • Disable suspect modules via the kernel boot line
  • Remove or update the offending module
  • Roll back to a known-good kernel
  • Replace faulty RAM
  • Reboot and verify full functionality
  • Maintain a reliable kernel testing process
  • Monitor hardware health
  • Keep a cross-tested module database
  • What was the root cause? Was it a faulty driver, a hardware failure, or human error?
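The module-disabling step above can be done from the GRUB kernel command line; the module name nouveau is an example:

# press 'e' at the GRUB menu and append to the linux line:
modprobe.blacklist=nouveau

# or blacklist it persistently once booted
echo "blacklist nouveau" > /etc/modprobe.d/blacklist-nouveau.conf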

Filesystem Issues

Filesystem not mounting

  • The usual mount command returns errors
  • Scheduled backups and applications suddenly cannot access certain directories
  • Errors like "unknown filesystem type", "mount: wrong fs type", or "superblock corrupt" in system logs
  • Boot into rescue mode or unmount any stale references, run fsck against the affected device, and inspect or repair the superblock if needed
  • If the issue arises from /etc/fstab, correct the UUID or device path, then test the mount manually before updating fstab
  • Does the system now mount cleanly?
  • Confirm read/write access and update any monitoring dashboards to reflect that the volume is back online
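A hedged sketch of the repair flow; the device and mount point are examples:

umount /dev/sdb1          # clear any stale mount
fsck -y /dev/sdb1         # repair filesystem errors automatically
mount /dev/sdb1 /data     # test the mount manually
findmnt /data             # confirm it is mounted read-write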

Partition not writable

  • Processes failing with a permission denied message
  • Applications unable to save files even though the directories appear to exist
  • Maybe the filesystem is mounted read-only. Examine /proc/mounts to confirm the ro flag
  • Unmount the partition, run fsck to repair any underlying errors, and then remount it with the correct read-write permissions
  • Does the issue persist?
  • Inspect ownership and ACLs, then apply chmod or chown to grant the correct user or service write access
  • Update any configuration management scripts

OS filesystem is full

  • Applications and users are unable to write logs and files
  • Check partition usage to confirm issue
  • Truncate or rotate logs, clean up old core dumps, purge orphaned Docker images, or archive older data to secondary storage
  • Extend the LVM volume or resize the partition, then resize the filesystem
  • Implement proactive monitoring for storage space
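A sketch of the cleanup and extension steps; the paths and LVM names are examples:

df -h /var                            # confirm usage
du -xh /var | sort -h | tail -10      # find the largest directories
journalctl --vacuum-size=200M         # trim the systemd journal

# extend the LVM volume and grow the filesystem in one step
lvextend -r -L +5G /dev/vg0/var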

Inode exhaustion

  • df -h may show that space is available
  • Typical message: Cannot create file: No space left on device
  • Check df -i to see if the inode count is at 100%
  • Identify directories with excessive file counts and then clean up old or stale files
  • Create a new filesystem with a higher inode ratio and then migrate the data if necessary
  • Update cleanup policies or add scripts to remove temporary files automatically, preventing a repeat of the issue
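A sketch for hunting down file-heavy directories; the path is an example:

df -i                                     # confirm inodes are exhausted
for d in /var/spool/*/; do
  echo "$(find "$d" -xdev | wc -l) $d"    # count entries per directory
done | sort -n | tail -5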

Quota issues

  • Individual users or groups cannot write files despite free space in the partition
  • Typical message is Disk quota exceeded when creating or writing to a file
  • Use repquota -a and quota -u <USERNAME> to view group or user quotas
  • Adjust soft and hard limits if necessary
  • Identify and remove unnecessary data from the user's home or project directories
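The commands above in a quick sketch; the username is an example:

repquota -a              # report quota usage for all mounted filesystems
quota -u alice           # view one user's limits and usage
edquota -u alice         # interactively adjust soft and hard limits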

Process Issues

Unresponsive Processes

Unresponsive processes occur when a running program stops responding to input or system scheduling, causing tasks to hang indefinitely.

  • Does the process fail to respond to user input or system events?
  • Is it consuming excessive resources?
  • Spot this with top and ps
  • Use strace to watch the process's system calls
  • Send SIGTERM to let the process shut down cleanly, and if that fails, escalate to SIGKILL to free resources by force
  • Examine journalctl to determine what caused the process to become unresponsive
  • Implement preventive measures
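A sketch of the sequence above; the PID is an example:

ps -o pid,stat,etime,cmd -p 4321   # check the process state
strace -p 4321                     # watch what the process is doing
kill -TERM 4321                    # ask it to exit cleanly
kill -KILL 4321                    # force it if SIGTERM is ignored
journalctl _PID=4321               # review why it became unresponsive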

Killed Processes

They happen when a process is forcibly terminated by a signal.

  • Check journalctl and dmesg for the reason the process was killed
  • Logs may show Killed process <PID> or oom_reaper entries indicating a killed process
  • Go through the logs to determine whether the system or a person killed the process
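A quick way to check whether the OOM killer was responsible:

dmesg | grep -i 'killed process'     # kernel OOM killer messages
journalctl -k | grep -i oom          # same, via the journal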

Segmentation Fault

A crash that happens when a program tries to access memory it shouldn't, leading to an abrupt termination with an error message.

  • Configure the system to generate and retain core dumps
  • Use the GNU Debugger to analyze the core file and pinpoint the faulty code path
  • Is the issue from a package? Reinstall a version of the package without the bug
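A minimal sketch, assuming a local crashing program named myapp:

ulimit -c unlimited       # allow core dumps in this shell
./myapp                   # crashes with a segmentation fault
gdb ./myapp core          # load the dump; type 'bt' for a backtrace

# on systemd systems, coredumpctl can locate and open the latest dump
coredumpctl gdb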

Memory Leaks

Memory leaks occur when a program continuously allocates memory without freeing it, gradually exhausting available RAM and degrading system performance.

  • Watch the RES memory rise steadily with no drop
  • Who is reserving the memory? Review logs and output
  • Schedule periodic restarts of the service or allocate more RAM to reduce impact
  • Continue monitoring RES
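A sketch for watching RES over time; the PID is an example, and pidstat comes from the sysstat package:

pidstat -r -p 4321 60     # memory stats every 60 seconds

# or poll with ps
while true; do ps -o rss= -p 4321; sleep 60; done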

System Issues

Device Failure

The server suddenly cannot read from or write to a critical piece of hardware, which is often a disk or network interface.

  • Identify the faulty device
  • Reseat or replace the device
  • If it is a RAID disk, mark the bad disk as failed and rebuild the array with a spare
  • Check disks and network to confirm full functionality of the system

Data corruptions issues

They occur when files refuse to open, applications crash, or filesystem errors appear in system logs.

  • Run fsck to detect corrupted data
  • Is there a known-good backup? restore from backup
  • Use fsck with repair options to attempt recovery on the live server
  • What was the root cause? A failing disk? A power outage?
  • Verify full system functionality before it returns to production

Systemd unit failures

They occur when a service that should be running won't start or crashes immediately.

  • Inspect service with systemctl status <SERVICE> or journalctl
  • Maybe edit the unit config in /etc/systemd/system/
  • Run systemctl daemon-reload to apply changes
  • Start service with systemctl start <SERVICE>
  • Set up alerts to catch unit failures
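The flow above as commands; the service name is an example:

systemctl status myapp.service     # see why it failed
journalctl -u myapp.service -e     # read its recent log entries
systemctl daemon-reload            # pick up unit file edits
systemctl start myapp.service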

Server inaccessible

User cannot remotely access the server.

  • Does ping timeout?
  • Does SSH hang?
  • Does the out-of-band management tool respond?
  • Are other servers in the network reachable?
  • Try physical access
  • Maybe reboot the machine, restore network configs from backups, or repair corrupt network service files
  • Validate server is reachable again

Dependency Issues

Package Dependency Issues

Occur when software cannot find or install the components it needs

  • Are the necessary repositories enabled?
  • Run dnf deplist <PACKAGE> or apt-cache depends <PACKAGE> to find missing dependencies
  • Upgrade or downgrade package if necessary
  • Rerun installation and verify software loads without issues

Path Misconfiguration Issues

Occur when the system cannot locate a program even though it is installed

Typical error message is Command not found

  • Examine echo $PATH to check current search directories
  • Add missing directory by editing /etc/profile or similar
  • Reload shell or re-login to apply changes
  • Run command again to confirm the program is found
  • Document changes for future deployments
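A sketch of the fix; the directory is an example:

echo $PATH                                  # inspect current search directories
echo 'export PATH=$PATH:/opt/myapp/bin' | sudo tee /etc/profile.d/myapp.sh
source /etc/profile.d/myapp.sh              # reload without re-login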

Linux: Monitoring Concepts and Configurations

Service Monitoring

Service Level Indicators (SLIs) are specific metrics such as uptime, response time, or error rates. They are used to measure the performance of a service.

Service Level Objectives (SLOs) are targets to meet based on measurements such as maintaining 99.9 percent uptime.

Service Level Agreements (SLAs) are formal promises to customers or stakeholders outlining the expected level of service and the consequences if expectations are not met.

Network Monitoring

Network monitoring is the process of keeping track of devices like routers, switches, and servers to make sure everything is running properly.

SNMP - Simple Network Management Protocol

SNMP allows devices to report performance data using a structure called MIB, or Management Information Base. The MIB acts as a built-in database that defines everything that can be monitored on a device, including CPU load, memory usage, and network interface status.

The MIB contains Object Identifiers (OIDs). An OID is a unique number used to locate and retrieve specific information.

SNMP Traps are automatic alerts triggered by specific events like hardware failure or dropped network connections.
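A hedged example query using the net-snmp tools; the community string and address are assumptions:

# read a device's system description from its MIB
snmpwalk -v2c -c public 192.168.1.1 sysDescr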

Agent-based vs Agentless Monitoring

Agent-based monitoring uses software installed on the monitored device to collect monitoring information. SNMP is an agent-based monitoring tool.

Agentless monitoring collects data using existing remote access protocols without requiring any additional software installation on the monitored devices. On Windows systems, protocols like Windows Management Instrumentation (WMI) allow similar agentless access.

Event-driven Data Collection

Health Checks

Health checks allow systems to automatically test whether a service is running and responding as expected.

# checks if a web service returns a success response
curl -I http://localhost

# check if a systemd service is up and running
systemctl is-active ssh

Webhooks

Webhooks are often used for real-time integrations between services.
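A sketch of firing a webhook with curl; the URL and payload are examples:

curl -X POST -H 'Content-Type: application/json' \
  -d '{"event":"disk_full","host":"web01"}' \
  https://hooks.example.com/alerts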

Log Aggregation

Log aggregation is the collection of logs from across the network and storing them in a central location.
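A minimal rsyslog forwarding sketch; the collector address is an example:

# /etc/rsyslog.d/forward.conf
*.* @@192.168.10.50:514    # @@ = TCP, a single @ = UDP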

Event Management

Logging

Logging provides the raw data needed to understand what is happening across a system. Logs are typically stored in the directory /var/log/ and include files like syslog, auth.log, dmesg, and more.

SIEM = Security Information and Event Management. A SIEM system collects and analyzes logs from across the network to help identify security threats, system issues, and unusual activity in real time.

Events

Events are generated when specific patterns or conditions are detected in the log data that indicate something noteworthy has happened.

Alerting and Notifications

Notifications

Notifications are how a Linux admin is informed when the system detects that something may require attention. They can be sent via Email, Text Messages, Desktop pop-ups, ticketing system or collaboration platforms.

Alerts

Alerts are the system's internal triggers that cause the notifications to be sent.

Linux: Automated Tasks with Shell Scripting

Parameter Expansion

Parameter Expansion is a way to substitute the value of a variable into a command or script so that the instructions become dynamic and flexible instead of static. ex: ${var}

${var}

${var} is used in shell environments to insert the value of a variable into a command. var is the name of the variable we want to expand.

ex:

location="/var/log"

cd ${location}

Command Substitution

Command Substitution inserts the result of a command directly into another command or script.

'bar' - Single-Quoted String

Everything in the single quote is treated as literal text. There will be no variable expansion and no command substitution. The text will be printed as it is written.

ex:

echo 'Warning: $PATH cannot be found'

Warning: $PATH cannot be found

$(bar) - Substituting a Command

This is how command substitution is done. This will run the command inside the parentheses by replacing the $(...) with the command's output.

# /backup/YYYY-MM-DD
mkdir /backup/$(date +%F) # mkdir /backup/2025-11-15

Subshell Execution

A subshell is a separate child process created by the shell to execute a command or group of commands in isolation without affecting the current shell environment. Whatever happens inside the subshell will not carry over to the main shell session.

(bar) - Creating a Subshell

The syntax is (cmd1; cmd2;...). All commands inside the parentheses are executed in a child shell.

ex:

# execute the command in a new shell
(bar)
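A concrete example of subshell isolation: the directory change below stays inside the subshell.

# cd only affects the child shell
(cd /tmp && ls)
pwd    # the parent shell's working directory is unchanged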

Functions

A function is a set of commands packaged under a single name to allow repeated use without rewriting the commands each time.

ex:

function hello {
  echo "Hello, $1"
}

hello() {
  echo "Hello, $1"
}

Bash functions can only return numeric exit codes.

Variables by default are global. Use local to define a local variable in functions.

ex:

# to define a local variable in a function
function hello {
  local my_var="Hello"
}

Internal Field Separator / Output Field Separator

IFS tells the shell where to split input into distinct words.

OFS is awk's output field separator, used to re-assemble data for output.

Avoiding Word Splitting

Word Splitting is the shell's habit of treating spaces, tabs, and newlines inside a variable as natural break-points. To fix this, we wrap the variable in double quotes or pass it through printf. ex: printf '%s\n' "$variable".

With

file_path="My project/file.txt
cat file_path

the shell will attempt to open 2 files: My and project/file.txt.

But printf '%s\n' "$file_path" produces the exact string, on one line, with no splits.

Controlling Input Splitting

IFS=<DELIMITER> read -r VAR1 VAR2 ... <<< "$TEXT"

ex:

IFS=',' read -r name city role <<< "tom,New York,Developer"

Output Formatting

A common pattern is awk 'BEGIN{OFS="<DELIMITER>"} {print $1,$2,...}' <FILE>

ex:

# converting a portion of the /etc/passwd file into a CSV
awk 'BEGIN{FS=":"; OFS=","} {print $1,$3,$4}' /etc/passwd | head -n 3

BEGIN{FS=":"; OFS=","} sets the input field separator to a colon and tells awk that commas should go between output fields.

$1, $3, and $4 refer to the username, UID, and GID columns respectively.

Conditional Statements

if

It is used for running a single yes or no task like:

  • Verifying a service is running

  • Checking free disk space

  • Making sure a variable isn't empty

if condition; then
    commands
elif another_condition; then
    commands
else
    commands
fi

# to check for a file
location="/var/log/auth.log"

if [[ -f $location ]]; then
    echo "$location exists"
elif [[ -d $location ]]; then
    echo "$location is a directory"
else
    echo "$location does not exist"
fi

Options include:

  • -f for a file

  • -d for a directory

  • -z for an empty string

  • -eq numeric equal

  • -ne numeric not equal

  • -lt numeric less than

  • -gt numeric greater than

  • = string equal

  • != string not equal

case

A case statement is used when a variable can take several acceptable values, or answers, and needs different actions for each.

case expression in
    pattern1)
        commands ;;
    pattern2|pattern3)
        commands ;;
    *)
        commands ;;   # default case
esac
echo "Select an option: start | stop | restart"
read action

case $action in
    start)
        echo "Starting service..." ;;
    stop)
        echo "Stopping service..." ;;
    restart|reload)
        echo "Restarting service..." ;;
    *)
        echo "Unknown option: $action" ;;
esac

$1 is a positional parameter. It means it automatically holds the first command-line argument that was supplied when the script was launched.

Looping Statements

Loops allow a program to repeat actions automatically without rewriting the same instructions repeatedly.

for

A for loop repeats a task a specific number of times or once for each item in a list.

ex:

for fruit in orange apple banana
  do
    echo "fruit: $fruit"
  done

while

A while loop continues running as long as a condition remains true. It is great when you do not know how many times something should repeat.

counter=1
while [ $counter -le 5 ]
  do
    echo "count is $counter"
    ((counter++))
  done

until

until runs until a condition becomes true.

counter=1
until [ $counter -ge 5 ]
  do
    echo "count is $counter"
    ((counter++))
  done

Interpreter Directive

An interpreter directive is a special line at the very top of the file that tells the system which program should be used to interpret the commands that follow.

It starts with #!, called a shebang, followed by the path of the interpreter, like /bin/bash.

For bash scripts, we typically use #!/bin/bash

ex:

hello.sh

#!/bin/bash

echo "hello world"

Numerical Comparisons

  • -eq equal to

  • -ne not equal to

  • -lt less than

  • -le less than or equal to

  • -gt greater than

  • -ge greater than or equal to

They are used inside [ ] or [[ ]] when making comparisons.

result=8
if [ "$result" -lt 5 ]; then
    echo "Less than 5"
elif [ "$result" -eq 5 ]; then
    echo "Equal to 5"
else
    echo "Greater than 5"
fi

Redirection String Operators

> redirection operator

> redirects output to a file. It creates the file automatically if it does not exist or overwrites its content if it does.

echo "Operation completed with code 0" > result.txt

< redirection operator

< takes input from a file.

read value < input.txt

Comparison String Operators

String comparison operators check whether two pieces of text are the same, different, match a pattern, or follow a certain alphabetical order.

  • == and = for comparing if two strings are equal
  • != for checking if two strings are not equal
  • =~ for matching patterns using regular expressions
  • < and > for comparing string alphabetical order (bash has no <= or >= for strings)

==, =, and =~

== is typically used inside double square brackets ([[ ]]) and is read as is equal to.

= is used inside single square brackets ([ ]) and is read simply as equals.

=~ is used for more advanced comparison.

#!/bin/bash

text="Hello"

if [[ $text == "Hello" ]]; then
  echo "Text is exactly Hello"
fi

if [[ $text =~ ^H ]]; then
  echo "The text starts with H"
fi

!=

#!/bin/bash

result="completed"

if [$result != "completed"]; then
  echo "The task completed successfully"
fi

< and >

This is a Lexicographical Comparison. Bash checks which string would come first or last in alphabetical order. Note that inside [[ ]], bash supports < and > for strings but not <= or >=.

< is read as comes before (is less than)

> is read as comes after (is greater than)

#!/bin/bash

fruit="papaya"

if [[ $fruit > "mango" ]]; then
  echo "$fruit comes after mango"
fi

if [[ $fruit < "melon" ]]; then
  echo "$fruit comes before melon"
fi

Regular Expressions

A regex is a special sequence of characters that defines a search pattern.

Bash uses =~ inside [[ ]] to match patterns with regular expressions ([[ $variable =~ pattern ]]).

#!/bin/bash

data="234567"

if [[ $data =~ ^[0-9]+$ ]]; then
  echo "The data contains only numbers"
fi

Test Operators

Test operators are special symbols used to evaluate things like file existence, string content, and logical conditions. They return either true or false.

-d and -f

-d and -f are operators used in scripts to check whether something exists on the filesystem and whether it is a directory or a regular file.

#!/bin/bash

if [ -d "project" ]; then
  echo "the project folder is a directory"
fi

if [ -f "app.conf" ]; then
  echo "app.conf is a file"
fi

-n and -z

-n and -z are string test operators. They help check whether a string has a value or is empty, which is especially useful when dealing with user input.

#!/bin/bash

input=""

if [ -z "$input" ]; then 
  echo "The input is empty"
fi

input="hello"

if [ -n "$input" ]; then
  echo "the input is not empty"
fi

!

! is the logical negation operator.

#!/bin/bash

if [ ! -f "config.txt" ]; then
  echo "config file does not exist"
fi

Variables

Variables are used to store and work with information like text, numbers, or user input.

Positional Arguments

Positional arguments are values passed to a script when running it, allowing the script to respond to user input. The first argument is $1, the second is $2 and so on.

#!/bin/bash

if [ "$1" -gt 5 ]; then
  echo "The number is greater than 5"
else
  echo "The number is less than or equal to 5"
fi
# then run the script
./script.sh 10

Environment Variable

Environment variables are built-in variables provided by the system or user that store important information.

Built-in variables:

  • $USER: username
  • $HOME: home directory
  • $SHELL: current shell

#!/bin/bash

if [ "$USER" = 'root' ]; then
  echo "You are logged in as the root user"
else
  echo "You are logged as regular user $USER"
fi

Alias and Command Management

alias

The alias command creates shortcuts for longer commands. The generic syntax is alias name='command'.

Aliases set in the terminal are only temporary and only last for that session.

# create a shortcut called ckdsk
alias ckdsk='df -h'

unalias

The unalias command removes shortcuts that were previously created.

# to remove a previously created alias
unalias ckdsk

set

set modifies the behavior of the shell.

#!/bin/bash
# to stop script from running if any command inside it fails
set -e

echo "running system update..."

sudo dnf update

echo "update completed"

Other options with set:

  • -x prints each command before it is executed
  • -u exits script when attempting to use an undefined variable
  • -o pipefail makes a pipeline fail if any command in the pipeline fails

Variable Management

  • export allows a variable to be passed to child processes
  • local restricts a variable's scope to within a function
  • unset deletes a variable

export

export is used to make a variable available to child processes, such as a subshell or another script that is launched from the current shell. The syntax is export VARIABLE=value

export LOG_LEVEL=debug

./myscript.sh # runs in a separate shell process but still has access to LOG_LEVEL because of 'export'

local

The local command is used to restrict a variable to within a function. The syntax is local VARIABLE=value

unset

unset is used to remove a variable. The syntax is unset VARIABLE

log_file="log.txt"

echo "processing file"

unset log_file

Return Codes

A return code or exit status is a number left behind after a command or program finishes in Linux to indicate success or failure.

$? is used to see the exit code of the last command.

  • 0 means success
  • Non-zero means error
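For example:

ls /etc/hosts
echo $?              # prints 0 on success
ls /no/such/file
echo $?              # prints a non-zero value on error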

Linux: Automation and Orchestration

Ansible IaC Core Concepts

Ansible lets users automate system configuration and management using clear, repeatable commands. Ansible is agentless. It uses SSH on Linux and WinRM on Windows.

Installing Ansible

# on RHEL-based system
dnf install -y ansible-core

# on Debian-based system
apt install -y ansible

# to test install
ansible --version
ansible localhost -m ping

Inventory

An inventory is a list of all the servers or devices. It can be stored as a simple text file using the INI format, as a structured YAML file, or built dynamically from cloud platforms or CMDBs.

Groups such as [web] organize servers for easy management.

ex: inventory file

# ./hosts
[local]
localhost ansible_connection=local

# to install htop on a RHEL-based system
ansible -i hosts local -m dnf -a "name=htop state=present update_cache=yes" --become

# to create a new user
ansible -i hosts local -m user -a "name=bob state=present" --become

# to copy a file from the control node to the managed nodes
ansible -i hosts local -m copy -a "src=my_config.conf dest=/apps/myapp.conf" --become

Ad Hoc Mode

Ad Hoc Mode is used to run one-time commands to test settings or apply changes across systems.

ex:

# to ping all hosts listed in the inventory
ansible all -m ping

# to restart the nginx service in the web group
ansible web -m service -a "name=nginx state=restarted"

Module

A module is a built-in tool that handles specific tasks like installing software, restarting services, or managing users.

ex of modules:

  • yum: used to install, update, remove packages on RHEL-based systems
  • apt: used to install, update, remove packages on Debian-based systems
  • user: used to manage user accounts on the system
  • service: used to start, stop, restart, or enable services
  • copy: used to transfer files from the control node to remote machines
  • file: used to create directories, change permissions, or delete files

Playbook

Playbooks define complex, repeatable, structured automation. A playbook is a structured YAML file that defines a set of tasks for Ansible to carry out on managed systems.
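A minimal playbook sketch, written as a shell heredoc and targeting the local group from the inventory example above; the file and package names are assumptions:

cat > site.yaml <<'EOF'
- hosts: local
  become: true
  tasks:
    - name: Ensure htop is installed
      ansible.builtin.dnf:
        name: htop
        state: present
EOF

# run the playbook against the inventory
ansible-playbook -i hosts site.yaml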

Facts

Facts allow Ansible to automatically gather information about each machine and make decisions based on that data. Data collected can include IP addresses, operating system, available memory, and disk space. Facts are gathered at the beginning of playbook execution so Ansible can decide what action to take based on the current setup of the machine. Ansible collects facts only when a task runs, over a direct connection like SSH.
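To see the facts Ansible gathers for a host, the setup module can be run ad hoc:

ansible -i hosts local -m setup | head -40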

Collections

Collections help manage and reuse tools, making it easier to scale and maintain an automation environment over time.

Puppet Core Usage

Puppet helps automate system configuration by letting admins describe what the system should look like. Puppet is agent-based.

The Puppet Agent is responsible for communicating with the Puppet server and applying configurations. It is also responsible for collecting facts.

The Puppet server is called Puppet Master.

Facts

Facts are information Puppet collects on the managed devices such as operating system, hostname, IP addresses, memory, and more. The Puppet Agent collects facts on a regular schedule during each check-in with the Puppet server. Puppet is well suited for large-scale enterprise environments because it enforces regular automated configuration.

Classes

Classes group related configuration tasks together into one logical unit. They help apply consistent settings to many systems with minimal duplication of effort.

Modules

A module is a package that includes everything needed to manage a specific task or part of a system. It can include one or many classes, files, templates, or custom facts.

Certificates

Certificates ensure that only authorized machines are allowed to talk to the server and receive configurations. The certificates must be approved and signed by the server before configurations are exchanged.
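A hedged sketch of the signing workflow on recent Puppet versions; the certname is an example:

# on the Puppet server
puppetserver ca list                                 # show pending certificate requests
puppetserver ca sign --certname node01.example.com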

OpenTofu Core Usage

OpenTofu is an open-source designed to help manage and automate cloud infrastructure with code.

Provider

A provider connects configuration code to the actual cloud platform or service that the user is trying to manage. OpenTofu talks to services like AWS, Azure, and GCP using APIs.

Resources

Resources are the specific pieces of infrastructure the user wants to create or manage, such as a virtual machine, a firewall rule, or a storage bucket. OpenTofu resources focus on provisioning and configuring cloud services from the ground up.

State

The state is how OpenTofu keeps track of what has already been created in the environment.

Unattended Deployment

Unattended deployment automates the installation and initial configuration of systems to avoid manual, step-by-step administration.

Kickstart

Kickstart is commonly used in traditional data center environments with RHEL-based systems. You automate the RHEL-based installation by specifying things like language, disk setup, network settings, and package selection in a configuration file. The general syntax to start a kickstart install from a boot prompt is linux ks=<LOCATION OF KICKSTART FILE> inst.repo=<INSTALLATION SOURCE>

# to start a kickstart install
linux ks=http://192.168.10.10/kickstart/ks.cfg inst.repo=http://192.168.10.10/rhel8

Cloud-init

Cloud-init is the standard for automating deployments in cloud platforms like AWS, Azure, or OpenStack. It reads a YAML configuration file in order to apply the changes during the first boot of a cloud instance.

ex:

# to create an install and configure using cloud-init
aws ec2 run-instances --image-id ami-0adfads185141422356 --instance-type t2.micro --user-data file://init-script.yaml

CI/CD Concepts

CI/CD is a system of tools and practices that brings order and automation to modern software development.

Version Control

Version control is a system that tracks changes to files over time, allowing developers to collaborate, review history, and roll back if something goes wrong. Git is the most common version control tool used today.

Pipelines

A pipeline is a sequence of automated steps that take code from commit to deployment. It might include testing, security scanning, building the software, and deploying to production.

Modern CI/CD Approaches

Shift Left Testing

Shift left testing moves testing earlier in the development cycle, right alongside coding. The common tools used are Jenkins and GitLab CI.

DevSecOps

DevSecOps = Development, Security, and Operations. It is an approach that builds on CI/CD by embedding security practices throughout the software lifecycle.

GitOps

GitOps is a way of managing infrastructure and deployments using Git as the single source of truth. Common tools used are Argo CD and Flux.

Kubernetes Core Workloads for Deployment Orchestration

Kubernetes is an open-source platform that automates the deployment, scaling and management of containerized applications.

Pods

Pods are where the applications run. They allow users to tightly couple containers that need to work together. Containers that run in the same pod can talk to each other as if they were running on the same machine.

Deployments

Deployments make sure the right number of Pods are up and are kept up to date. A deployment acts like a controller that keeps track of the application and ensures the right number of Pods are always running and up to date.

Services

Services ensure the application is reachable by other apps or users. They provide a stable endpoint so other applications or users can reliably connect to the app regardless of which Pod is currently running.
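A quick sketch of all three objects with kubectl on recent versions; the names and image are examples:

kubectl create deployment web --image=nginx --replicas=3   # Deployment managing 3 Pods
kubectl expose deployment web --port=80                    # Service with a stable endpoint
kubectl get pods,svc                                       # verify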

Kubernetes Configuration

Variables

Variables are the simplest way to pass configuration settings into the containers.

ex:

# to tell the pod to use production settings
ENVIRONMENT=production

ConfigMaps

ConfigMaps store larger sets of configuration data in a Kubernetes object.

Secrets

Secrets work similarly to ConfigMaps, but are specifically designed to store sensitive data such as passwords, API tokens, SSH keys, or SSL/TLS certificates. By default, secrets are encoded in base64. Kubernetes uses RBAC to control access to these secrets.
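A sketch of creating both objects; the names and values are examples:

kubectl create configmap app-config --from-literal=LOG_LEVEL=info
kubectl create secret generic db-pass --from-literal=password='S3cret!'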

Volumes

Volumes provide a way for containers to store and access data that needs to persist beyond the life of a single container.

Docker Swarm Core Workloads for Deployment Orchestration

Docker Swarm is a tool that helps orchestrate container deployments, making sure everything runs reliably.

Nodes

A node is a physical or virtual machine that is part of the swarm cluster. A node runs the Docker engine and is classified as either a Manager Node, which makes decisions and assigns tasks, or a Worker Node, which carries out the tasks.
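A minimal sketch of standing up a swarm and inspecting its nodes:

docker swarm init          # turn this machine into a manager node
docker node ls             # list manager and worker nodes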

Tasks

A task is the actual instance of a container running on a node. Each task maps to exactly one container, and Swarm monitors them all continuously. Pods can host multiple containers, while a swarm task maps one-to-one. Tasks help ensure that the application stays running as expected.

Service

A service is a top-level object in Docker Swarm that defines how the application runs.
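A sketch of creating a replicated service; the name and image are examples:

docker service create --name web --replicas 3 -p 8080:80 nginx
docker service ls          # verify the service and its replica count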

Docker Swarm Configuration

Networks

Networks define how containers communicate within the swarm.

Overlay Networks

Overlay Networks are virtual networks that span across all nodes in the swarm. They enable secure, seamless communication between containers on different nodes.

ex: docker-compose.yaml

# define an overlay network called frontend
networks:
  frontend:
    driver: overlay

Scaling

Scaling refers to how many replicas of a service are running at any given time.

ex: docker-compose.yaml

services:
  web:
    image: nginx
    deploy:
      replicas: 3

Docker/Podman Compose for Deployment

Compose file

docker-compose.yaml

version: "3.8"

services:
  web:
    image: nginx
    ports:
      - "8080:80"

  app:
    image: my-web-app:latest
    environment:
      - ENV=production
    depends_on:
      - db

  db:
    image: postgres
    volumes:
      - db-data:/var/lib/postgresql/data

volumes:
  db-data:

Podman Compose is the Compose counterpart for Podman, a more security-focused container engine.

up and down commands

# to start or bring down containers
docker-compose up
docker-compose down
# to start or bring down containers
podman-compose up
podman-compose down

Viewing Logs

Viewing logs is essential for understanding what's happening in the application.

# to view logs
docker-compose logs
podman-compose logs
# to tail logs
docker-compose logs --follow web

Linux: OS Hardening

sudo Configuration

sudo = Super User Do

sudo allows regular users to execute commands with elevated root privileges.

visudo

visudo is the tool used to edit the sudo configuration. The main configuration is at /etc/sudoers. Additional configuration files are stored in /etc/sudoers.d. It is highly recommended to use visudo because it prevents saving invalid configuration.

  • -c checks syntax of sudoers file without editing

  • -f <CONFIG FILE PATH> specifies a different file to edit with visudo

  • -s runs the editor in strict mode

  • -x exports the sudoers file in JSON format for automation or auditing

wheel group

The wheel group is a special group commonly used in Linux systems to grant its members permission to run administrative commands with sudo. ex: sudo usermod -aG wheel john

/etc/sudoers

/etc/sudoers is the main config file accessed by visudo.

  • %admin ALL=(ALL) ALL gives users in the admin group full access to the system

  • john ALL=(ALL) /usr/bin/systemctl restart nginx gives user john permission to run only systemctl restart nginx

/etc/sudoers.d

/etc/sudoers.d allows admins to break up sudo configurations into multiple smaller files. That is helpful in enterprise environments with different levels of access needs, supporting automation tools like Ansible or Puppet.

sudoers Directives

  • NOPASSWD allows users to run sudo commands without being prompted for their password. It should be used with care in production environments to prevent accidental misuse.

# to not prompt tom for a sudo password
tom ALL=(ALL) NOPASSWD:/usr/bin/systemctl restart nginx
  • NOEXEC prevents a sudo-allowed command from launching additional programs or subshells. NOEXEC restricts behavior by disabling the ability of the command to spawn other processes.

# to prevent kyle from spawning programs from within less
kyle ALL=(ALL) NOEXEC:/usr/bin/less /var/log/syslog

sudo User Groups

  • sudo group: used in Debian-based systems like Ubuntu

  • wheel group: used in RHEL-based systems

sudo Group

# to add a user to the sudo group
sudo usermod -aG sudo amy

wheel Group

It serves the same purpose as the sudo group in RHEL-based systems.

# to add a user to the wheel group
sudo usermod -aG wheel amy

sudo -i

sudo -i opens a full root shell for users with sudo privileges from the sudo or wheel group. If the sudo config contains %sudo ALL=(ALL:ALL) ALL or %wheel ALL=(ALL:ALL) ALL, any user in those groups can elevate their permissions to a full root shell with sudo -i.

Root Shell

sudo su - switches to the root user entirely. su stands for substitute user.

File Attributes

File attributes provide an extra layer of control that goes beyond the standard file permissions.

lsattr

lsattr allows viewing a file's current attributes.

# to view a file's attributes
lsattr my-important-file.txt

Useful options:

  • -R lists attributes recursively in subdirectories

  • -a includes hidden files

  • -d shows directory attributes instead of their contents

  • -v shows the version number of the file if supported

# to view attributes of all files in a directory including hidden files
lsattr -a -d -R /etc/config

Output may look like:

----i--------e-- /etc/config/setting.conf
-----a-------e-- /etc/config/logs.log

The i indicates that the file is immutable.

The a indicates that the file is append-only.

chattr

chattr allows changing a file's attributes.

Common options include:

  • -R to apply changes recursively to directories and contents

  • -v to work with the file's version number

  • +i to set a file as immutable

  • -i to remove a file from immutable mode

# to protect a script from being modified or deleted
chattr +i /usr/local/bin/m_script.sh

# to remove immutable protection
chattr -i /usr/local/bin/m_script.sh

# to protect a directory from being modified or deleted
chattr -R +i /usr/local/bin/scripts/

File Permissions

chown

chown is used to change the ownership of a file and can also change its associated group at the same time. The general syntax is chown [OPTIONS] <NEW OWNER>[:NEW GROUP] file. The -R option applies the change recursively to all files and subdirectories.

# to change the owner and group of a folder
chown -R tyler:accounting /data/reports

chgrp

chgrp focuses specifically on changing the group ownership of a file without affecting the user ownership. The general syntax is chgrp [OPTIONS] <NEW GROUP> file.

# to change the group ownership of a directory
chgrp -R admins /scripts

File Modifications

chmod (change mode)

chmod allows changing file permissions by specifying a user class and a permission to add, remove, or set. The general syntax is chmod [who][operator][permission] file.

who can be:

  • u: user/owner
  • g: group
  • o: others
  • a: all

operator can be:

  • +: to add
  • -: to remove
  • =: to set

permission can be:

  • r: read
  • w: write
  • x: execute

# to give execute permission to a file
chmod u+x script.sh

chmod using Octal Notation

  • Read: 4
  • Write: 2
  • Execute: 1

The general syntax is chmod [mode] <FILE>.

# to give rwx permission to user, rx to group, and r to others
chmod 754 config.conf

File Special Permissions

Special permissions temporarily grant users additional access on certain conditions. There are mainly 3 special permissions: setuid, setgid, and the sticky bit.

setuid (Set User ID)

setuid allows a program to run with the privilege of the file's owner. The general syntax is chmod u+s <FILE>.

chmod u+s /scripts/run.sh

setgid

setgid is similar to setuid but focuses on group ownership. Files and subdirectories created inherit the group ownership of the directory. The general syntax is chmod g+s <FILE OR DIRECTORY>.

# to set the setgid of a folder
chmod g+s /data/

# to make files and directories created in a directory belong to the same group
chgrp admins /shared/scripts && chmod g+s /shared/scripts

sticky bit

The sticky bit is a special permission used on shared directories to prevent users from deleting or renaming files they do not own. The general syntax of the sticky bit is chmod +t <DIRECTORY>.

# to make users be able to delete their own files only
chmod +t /shared

Default File Creation Mask

umask (User File Creation Mask)

The umask defines which permission bits should be masked out or removed from the system's default permissions when a new file or directory is created.

New files begin with default permissions of 666 for user, group, and others. New directories start with default permissions of 777.

umask 022 removes the write bit (value 2) from the group and others permission sets:

666 - 022 = 644 => rw-r--r-- for new files

777 - 022 = 755 => rwxr-xr-x for new directories

# to see the umask value
umask

# to remove all access for others and write access for the group
# 640 for new files and 750 for directories
umask 027

To make the change permanent, the command can be added to the user's shell config in .bashrc or .profile.

Access Control Lists - ACLs

ACLs give more flexibility than standard file permission controls. They provide detailed file permission management by specifying unique access rights for individual users and groups beyond owner, group, and others. Two main commands are used to manage ACLs:

  • getfacl: to display current ACLs
  • setfacl: to modify or add new ACL entries

getfacl

getfacl is used to view ACL entries on files or directories. The general syntax is getfacl [OPTIONS] <FILE OR DIRECTORY>. -R allows displaying ACLs recursively.

ex:

# to view ACLs of directory and its content
getfacl -R data/

setfacl

setfacl is used to create or modify ACL entries, allowing admins to fine-tune file access. The general syntax is setfacl [OPTIONS] <PERMISSION> [FILE OR DIRECTORY].

Common options:

  • -m to modify or add an entry
  • -x to remove an entry
  • -R to apply changes recursively

ex:

# to give a user permission to rw a file via ACL
setfacl -m u:tom:rw config.conf

# to remove the ACL entry
setfacl -x u:tom config.conf

# to reset a file to use standard permissions
setfacl -b config.conf

# to set default ACL on a directory for new files to inherit
setfacl -d -m u:tyler:rw /data/reports

SELinux States

SELinux = Security-Enhanced Linux.

SELinux can be in one of the three states:

  • Disabled: No policy enforcement or logging
  • Permissive: No enforcement, but logs policy violations
  • Enforcing: Enforces policies and blocks unauthorized actions

SELinux - Disabled State

This mode is typically used for troubleshooting in extreme cases or when SELinux is not needed in a particular environment. SELinux configuration is located at /etc/selinux/config. Set SELINUX=disabled to disable SELinux. A reboot is required for the change to take effect.

# to view SELinux state
getenforce

SELinux - Permissive State

In this state, SELinux will log action violations that would have been blocked.

# to temporarily set SELinux to permissive without reboot
setenforce 0

The change can be made persistent after reboot by editing /etc/selinux/config with the value SELINUX=permissive.

SELinux - Enforcing State

This is the default and most secure state. It is ideal for production environments.

# to temporarily set SELinux to enforcing without reboot
setenforce 1

Update /etc/selinux/config and set SELINUX=enforcing to make the change persistent after reboot.

SELinux File Security Contexts

To work with SELinux File Security Contexts, Linux provides 3 commands:

ls -Z

ls -Z is used to display the current SELinux context for files and directories, with an added column showing the context label, including SELinux user, role, type, and level. The User part represents the SELinux user identity, such as system_u for system processes or unconfined_u for users not strictly controlled by SELinux. The Role part defines the permissions available to a process or user in a context, such as object_r for files and directories. The Type part is the most important part of the context, describing the object's purpose; SELinux policies use it to grant or deny access.

restorecon

restorecon is used to restore the default context of a file or directory based on SELinux policy. The general syntax is restorecon [OPTIONS] <PATH>.

# to recursively restore files context
restorecon -Rv /var/www/html

chcon

chcon is used to allow manual changes to a file context when necessary. The general syntax is chcon [OPTIONS] <FILE>.

Common options include:

  • -u to set the user
  • -r to set the role
  • -t to set the type

# to manually label a file for webserver access
chcon -u system_u -r object_r -t httpd_sys_content_t index.html

# then check with 
ls -Z index.html

SELinux System-wide Configuration

getsebool

getsebool checks the current status of SELinux booleans, which are special on/off switches that control how strict or flexible SELinux is in certain situations.

# to list all booleans
getsebool -a

# to view a selected boolean
getsebool antivirus_can_scan_system

# to see whether webserver is allowed to access user home directories
getsebool httpd_enable_homedirs

# to view whether FTP services are allowed to access users' home directories
getsebool ftp_home_dir

# to view whether Apache webserver can initiate outbound network connections
getsebool httpd_can_network_connect

# to see whether the Samba file sharing service can share users' home directories over the network
getsebool samba_enable_home_dirs

setsebool

setsebool is used to turn a specific boolean on or off, and optionally make that change permanent across reboot. The general syntax is setsebool [-P] <BOOLEAN NAME> on|off. -P makes the change persist across reboots.

# to allow webserver to serve content from user directories
setsebool -P httpd_enable_homedirs on

# to allow Apache to connect to network services
setsebool -P httpd_can_network_connect on

semanage

semanage is used for managing SELinux settings persistently, including booleans, port contexts, and file labels. The general syntax for working with booleans is semanage boolean -m --on|--off <BOOLEAN NAME>.

semanage boolean -m --on httpd_enable_homedirs

Port Contexts are used to allow a service such as a web or mail server to operate on a non-default port.

File Labels are used to define how SELinux should treat specific files or directories on the system.
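Hedged sketches of both uses; the port and path are assumptions:

# allow Apache to listen on a non-default port
semanage port -a -t http_port_t -p tcp 8081

# persistently label a custom web root, then apply the label
semanage fcontext -a -t httpd_sys_content_t '/srv/site(/.*)?'
restorecon -Rv /srv/site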

SELinux Logging and Troubleshooting

sealert

sealert reads SELinux audit logs and provides clear, human-readable summaries of what was denied and why. The logs are usually stored in /var/log/audit/audit.log. The general syntax is sealert -a <LOG PATH>.

# to review SELinux-related logs
sealert -a /var/log/audit/audit.log

audit2allow

audit2allow is a tool that helps generate new policy rules based on denials, resolving issues by safely expanding SELinux policy when appropriate. The basic syntax is audit2allow -a, which analyzes all current entries in the system's audit log. Alternatively, use audit2allow -i <LOG PATH>.

SSHD Secure Authentication and Access Control

SSHD: Secure Shell Daemon

Key vs Password Authentication

Password authentication requires users to type their password every time they log in.

Key-based authentication uses a private/public key pair for secure login. It can be enforced by setting PasswordAuthentication no in /etc/ssh/sshd_config, then restarting the service for the change to take effect.
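A sketch of setting up key-based login; the host is an example:

ssh-keygen -t ed25519                 # generate a key pair
ssh-copy-id user@server.example.com   # install the public key on the server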

PermitRootLogin

This controls whether the root user can log in via SSH. It is common to disable this feature to reduce potential attack vectors. To disable this feature, set PermitRootLogin no in sshd_config.

AllowUsers

AllowUsers restricts SSH login to specific users.

# to edit the config
nano /etc/ssh/sshd_config

# allow user to login via ssh and block everybody else
AllowUsers tom tyler jessica@ws1

AllowGroups

AllowGroups allows SSH login for members of a specific group.

# allow ssh login to members of selected groups only
AllowGroups sshusers

SSHD Secure Configuration and Usage

Disabling X11 Forwarding

X11 forwarding allows graphical applications to run on a remote system but be displayed on the local machine. To disable it, set X11Forwarding no in /etc/ssh/sshd_config, then restart the sshd service to apply the changes.

SSH Tunneling

SSH tunneling routes traffic through an encrypted SSH connection, which is useful for securely accessing internal web applications or databases not exposed to the internet.
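For example, to reach an internal web app through a bastion host (hostnames are examples):

# local port 8080 now forwards to intranet.local:80 over SSH
ssh -L 8080:intranet.local:80 user@bastion.example.com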

Secure Remote Access

SSH Agent

The SSH agent is a daemon that stores decrypted private keys in memory to avoid retyping the passphrase for each server connection.

# to load keys into daemon
ssh-add ~/.ssh/id_rsa

SFTP with chroot

SFTP with chroot is a key aspect of secure remote access, focused on controlling user interaction with the system during file transfers. It restricts filesystem access during encrypted file transfers using SFTP.

# to limit sftp users to their designated directories
Match Group sftpusers
  ChrootDirectory /home/sftp/%u
  ForceCommand internal-sftp
  X11Forwarding no
  AllowTcpForwarding no

fail2ban

fail2ban is a security tool used on Linux systems that automatically blocks IP addresses which show signs of suspicious activity.

Configuration

The main config is located at /etc/fail2ban/jail.conf. It is advised not to change that file but to make a copy, such as /etc/fail2ban/jail.local, and edit the copy.

Each section in the configuration is called a jail. Each jail corresponds to a specific service such as ssh.

ex:

# to monitor ssh failed login attempts and block if necessary
[sshd]
enabled = true
port    = ssh
filter  = sshd
logpath = /var/log/auth.log
maxretry = 3
bantime  = 10800

Restart the service to apply the changes with systemctl restart fail2ban.

Monitoring

The main log is located in /var/log/fail2ban.log

Triggering

Test triggering by simulating failed login attempts.

# to detect which IP addresses have been banned
fail2ban-client status sshd

Unblocking

fail2ban-client set [JAIL] unbanip [IP ADDRESS]

# to unblock an ip address
fail2ban-client set sshd unbanip 203.204.205.206

Avoid Insecure Services

Telnet

Telnet sends everything in plain text without encryption. It has been replaced by SSH.

# to disable telnet
systemctl disable --now telnet.socket

# remove telnet package
dnf remove telnet

FTP

FTP transmits credentials and files in plain text. SFTP (Secure FTP over SSH) and FTPS (FTP over TLS) are secure alternatives.

# to disable ftp 
systemctl disable --now vsftpd

# remove the package
dnf remove vsftpd

TFTP

TFTP has no authentication and no encryption. SCP and SFTP are secure alternatives.

# to disable tftp
systemctl disable --now tftpd

# remove the package
apt remove tftp-hpa
dnf remove tftp-server

Disable Unused File Systems

Linux often comes with rarely used filesystems.

Disabling filesystems by disabling kernel modules

  • cramfs
  • hfs
  • udf

To disable them, edit or create a config file in /etc/modprobe.d/ and add a line like install cramfs /bin/false.

# /etc/modprobe.d/cramfs.conf
install cramfs /bin/false
# /etc/modprobe.d/hfs.conf
install hfs /bin/false
# /etc/modprobe.d/udf.conf
install udf /bin/false

Disabling filesystems by editing fstab

fstab tells Linux which filesystems to mount automatically at boot time. Outdated entries in the fstab file can create security risks. Disable unnecessary lines by placing a # at the beginning.

Unnecessary SUID Permissions

SUID bit

SUID (Set User ID) bit is a special file permission that can be applied to executable files. Setting the SUID bit in the wrong executable can be a security risk.

# to look for SUID bit file: -rwsr-xr-x. The 's' indicates that SUID bit is set
ls -l 

SUID binaries

SUID binaries are the programs or executables that have the SUID bit set.

# to search entire root file system for all files with SUID bit set
find / -perm -4000 -type f 2>/dev/null

# remove the SUID bit
chmod u-s /usr/bin/rcp

Secure Boot

Secure Boot is a feature designed to prevent unauthorized or malicious code from running during the system's startup process.

UEFI

UEFI = Unified Extensible Firmware Interface. It is the modern replacement for the older BIOS system that was used for decades to initialize hardware and start the operating system.

Secure Boot can be configured via the UEFI setup menu, typically entered with F2, Del, or Esc.

Linux: Compliance and Audit

Detection and Response

Anti-malware tools

  • ClamAV: an open-source anti-malware option for Linux systems. Run clamscan -r /usr to scan each file against a recent virus database.

  • Linux Malware Detect (LMD): LMD is built on ClamAV to automatically scan uploads for PHP backdoors and known malware families

  • rkhunter: used for rootkit detection. Run it with rkhunter --check to check for rootkits.

  • chkrootkit: is used to inspect the system for hidden binaries, suspicious configurations, and tampered libraries

Indicators of Compromise (IoCs)

IoCs are things bad actors leave behind such as unexpected processes, odd network connections, unauthorized file changes, and more.

# to hunt for brute force login attempts
grep -i 'failed password' /var/log/auth.log

# to see open ports
ss -tulnp

We can use specialized tools like YARA, auditd, and Wazuh to hunt for IoCs.

Vulnerability Categorization

Vulnerability categorization is the practice of systematically identifying and describing software flaws.

CVE: Common Vulnerabilities and Exposures

CVE-YYYY-NNNNN:

  • YYYY: Year of the CVE

  • NNNNN: Sequence number within the year

CVSS: Common Vulnerability Scoring System

CVSS Categorization:

  • 0.0: None

  • 0.1-3.9: Low

  • 4.0-6.9: Medium

  • 7.0-8.9: High

  • 9.0-10.0: Critical

Vulnerability Management

Service Misconfigurations

A service misconfiguration occurs when a Linux daemon is left with unsafe defaults or overly permissive settings. For example:

  • leaving SSH configured to allow password-based root user login

  • binding critical services to all network interfaces (0.0.0.0) instead of localhost

Backporting Patches

Backporting patches is the process of taking security fixes from a newer version of a package and applying them to the older version running on the system.

Vulnerability Detection

Port Scanners

Port scanners detect open network ports and services running on those ports.

  • Nmap: run nmap -sS -sV 10.0.0.0/24 to perform a stealth scan with service version detection

  • Zenmap: GUI version of Nmap

Protocol analyzers (or packet sniffers)

Protocol analyzers allow deeper inspection of the data moving through these ports by capturing and examining the network traffic.

  • Wireshark: a GUI tool that offers advanced packet analysis for many network protocols

  • Tshark: The CLI version of Wireshark

  • tcpdump: useful for remote management. tcpdump -i eth0 port 443 -w capture.pcap

Standards and Audit

Center for Internet Security Benchmarks

CIS Benchmarks provide detailed, expert-developed best practices for configuring systems securely, from password policies to disabling unnecessary services. They give a standardized way to protect systems and prove compliance.

OpenSCAP - Open Security Content Automation Protocol

OpenSCAP allows admins to scan systems for compliance, identify security gaps, and even apply fixes, all based on recognized standards. It is a free and open-source tool that uses SCAP content to scan systems and tell how secure they are.
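A hedged example scan using content from the scap-security-guide package; the profile id and datastream path vary by distribution:

oscap xccdf eval --profile xccdf_org.ssgproject.content_profile_cis \
  /usr/share/xml/scap/ssg/content/ssg-rhel9-ds.xml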

File Integrity Verification

Signed Package Verification

Signed Package Verification helps confirm that the package being installed originates from a trusted source and has not been modified since its publication.
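On a RHEL-based system, a downloaded package's signature can be checked before installation; the file name is an example:

rpm -K package.rpm        # verifies digests and the GPG signature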

Installed File Verification

Installed File Verification allows periodic checks to ensure that none of the system files have been changed unexpectedly.
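A sketch of both package-manager approaches; the package name is an example, and debsums is a separate Debian package:

rpm -V openssh-server     # RHEL-based: flags files that differ from the database
debsums openssh-server    # Debian-based equivalent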

File Integrity Tools

Rootkit Hunter (rkhunter)

rkhunter is a lightweight tool that scans systems for signs of rootkits, backdoors, and known vulnerabilities by comparing system files and settings against a database of suspicious patterns.

# to do an interactive check for issues
rkhunter --check

# to do a non-interactive check
rkhunter --check --skip-keypress

# to update the rootkit signature database
rkhunter --update

It is common to schedule rkhunter as a cron job to scan the system and alert admins if anything suspicious is found.

Advanced Intrusion Detection Environment (AIDE)

AIDE builds a baseline snapshot of selected files and directories, captures details, and then compares the system to the snapshot during regular scans.

# to init AIDE
aide --init
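# note: on many distros the new database must then be moved into
# place before checks will use it (path may vary), e.g.:
# mv /var/lib/aide/aide.db.new.gz /var/lib/aide/aide.db.gz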

# to detect changes
aide --check

Data Destruction Overwriting Tools

Data destruction overwriting tools overwrite deleted data with random or specific patterns to prevent recovery, used in sensitive and enterprise environments.

shred

shred securely deletes individual files by overwriting them with random data multiple times.

# to destroy a file
shred -u -v -n 5 old_secrets.txt

dd if=/dev/urandom

This command is used to overwrite entire disks or partitions with random bits, preventing recovery of any previous contents.

# to overwrite a disk or partition with random bits
dd if=/dev/urandom of=/dev/sdc1 bs=1M status=progress

badblocks -W

badblocks is used to check for disk errors, but in write mode it can also destroy data by repeatedly overwriting the disk with test patterns.

# to erase a device
badblocks -wsv /dev/sdc2

Cryptographic Data Destruction

cryptsetup with LUKS

cryptsetup is used to encrypt an entire disk or partition. It can also be used to permanently destroy encrypted data by simply erasing one or more keyslots from the LUKS header. Each keyslot stores an encrypted copy of the master key.

# to erase all keyslots, making the data permanently inaccessible
cryptsetup luksErase /dev/sdb

zuluCrypt

zuluCrypt is a GUI and CLI tool for managing encrypted volumes. It supports LUKS volumes.

# to wipe an encrypted device
zuluCrypt-cli -z -d /dev/sdb1

Software Supply Chain

GPG - GNU Privacy Guard

GPG ensures that software comes from a trusted source and hasn't been tampered with.
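
For example, verifying a downloaded release against its detached signature (the filenames are illustrative):

# to import the publisher's public key
gpg --import publisher.asc

# to verify the detached signature on the download
gpg --verify app.tar.gz.asc app.tar.gz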

SBOM - Software Bill of Materials

An SBOM is the detailed ingredient list for software, listing the libraries, dependencies, and open-source components included in an application.

CI/CD

CI/CD is the process that automates how code is built, tested, and released, and it's the engine that keeps the modern software supply chain moving. Popular CI/CD tools include Jenkins, GitLab CI, and GitHub Actions.

Security Banners

Tools used to show banners:

/etc/issue

/etc/issue shows a message before login on local terminals

example of message: Authorized access only. This is Service 1 - Production

# to change the message
echo "Authorized access only!" | sudo tee /etc/issue

/etc/issue.net

/etc/issue.net shows a message before login over remote access like SSH. It is often used for legal warnings or policy notices.

example of message: Warning: Unauthorized access to this system is prohibited and will be prosecuted.

/etc/motd

/etc/motd (motd = message of the day) shows a message after successful login. It is commonly used to communicate helpful information to users.

example of message: System maintenance scheduled for Saturday at 11 PM. Please save your work.

# to change the message
echo "System maintenance scheduled for Saturday at 11 PM. Please save your work." | sudo tee /etc/motd

Linux: Cryptography

Data at Rest - File Encryption

GNU Privacy Guard - GPG

GPG is a tool used to encrypt, decrypt, and digitally sign files using asymmetric or symmetric keys.

# to generate keys
# the keys are stored in ~/.gnupg
gpg --full-generate-key

# to encrypt a file using an asymmetric key
gpg --encrypt --recipient 'user@demo.com' secrets.txt

# to encrypt a file using a symmetric key
gpg -c secrets.txt

# to decrypt a file with an asymmetric key
gpg --decrypt secrets.txt.gpg

# decrypt a file with a symmetric key
gpg secrets.txt.gpg

# to digitally sign a file
gpg --sign secrets.txt

# to verify a signature
gpg --verify secrets.txt.gpg

A digital signature helps verify the identity of the sender and the integrity of the file.

Data at Rest - Filesystem Encryption

Linux Unified Key Setup version 2 - LUKS2

LUKS2 is a standardized, on-disk encryption container that wraps a filesystem in an encrypted shell. Argon2 is the key-derivation function LUKS2 uses to slow down attackers by requiring significant time and memory to test each passphrase. Argon2 has three variants: argon2i, argon2d, and argon2id.

# to install required tools
dnf install cryptsetup

# to check if a device contains a LUKS header
cryptsetup isLuks /dev/sdc2

# to add an extra passphrase or keyfile
cryptsetup luksAddKey /dev/sdc2 ./key.bin

# to remove an existing passphrase or keyfile
cryptsetup luksRemoveKey /dev/sdc2

# to view the LUKS header
cryptsetup luksDump /dev/sdc2

# to encrypt a disk/partition
cryptsetup luksFormat --type luks2 /dev/sdc2

# to decrypt/open a disk/partition
# this creates a mapped device at /dev/mapper/encrypted_disk
cryptsetup luksOpen /dev/sdc2 encrypted_disk

# close the encrypted device
cryptsetup luksClose encrypted_disk

Data in Transit - OpenSSL

OpenSSL allows the creation and management of digital certificates and keys used to authenticate identities.

TLS Certificate

A TLS certificate is like a digital passport for a server. It contains important information about the server and is signed by a trusted Certificate Authority (CA).

# to generate an RSA private key
openssl genpkey -algorithm RSA -out server.key

# to create a certificate signing request - CSR
openssl req -new -key server.key -out server.csr
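
To actually produce a self-signed certificate from that key (a common lab setup; filenames and validity period are illustrative):

# to create a self-signed certificate valid for 365 days
openssl req -x509 -key server.key -out server.crt -days 365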

s_client

s_client is used to probe any TLS-enabled service from the command line.

# to retrieve the server's certificate and verify the issuer, expiry date, and intermediate certs
openssl s_client -connect mail.example.com:993 -showcerts

Protection Methods

TLS Protocol Versions

TLS 1.2 and 1.3 are considered safe; older versions should be disabled.

LibreSSL

LibreSSL is a fork of the original OpenSSL library designed to be easier to audit and maintain.

# to install LibreSSL
dnf install libressl

WireGuard

WireGuard is a next-generation VPN solution that operates inside the Linux kernel to secure entire network tunnels with modern cryptography.
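
As a minimal sketch, each WireGuard peer is identified by a key pair generated with the wg tool (filenames are illustrative):

# to generate a private key and derive its public key
wg genkey | tee private.key | wg pubkey > public.key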

Hashing

A hash function is a cryptographic algorithm that converts input of any size into a fixed-size digest, ensuring data integrity by making it practically impossible for two different inputs to produce the same output.

SHA-256

SHA-256 produces a 256-bit digest

# to calculate the checksum of a file
sha256sum myfile.txt

Hash-based Message Authentication Code - HMAC

HMAC combines a secret key with SHA-256 to generate a keyed digest, allowing recipients who share the secret to verify both the integrity of the data and the authenticity of its sender.

# to calculate the hmac using SHA 256
openssl dgst -sha256 -hmac "secretkey" myfile.txt

Removal of Weak Algorithms

  • Defense Information Systems Agency Security Technical Implementation Guides (DISA STIGs). DISA STIGs set the baseline for hardening Linux servers, and that includes explicitly turning off weak cryptographic algorithms (disabling legacy ciphers such as RC4 and 3DES, and prohibiting MD5-based hashing)

  • FIPS 140-2 defines approved cryptographic modules and algorithms for federal systems, and FIPS compliance on a Linux machine ensures only approved algorithms are offered.

  • Disable SSHv1 sudo sed -i 's/^#*Protocol.*/Protocol 2/' /etc/ssh/sshd_config && sudo systemctl restart sshd

  • Use sslscan to probe TLS-enabled services and flag anything outdated.

Certificate Management

No Cost Trusted Root Certificate

  • Let's Encrypt: Free to use

Commercial Root Certificate Authorities

They charge fees in exchange for extended validation procedures, longer certificate lifetimes, insurance warranties, and hands-on support.

  • DigiCert

  • GlobalSign

  • Sectigo

  • Entrust

Linux: Hardening Techniques

Password Composition Controls

Password Complexity

Password complexity ensures that user passwords include a mix of character types, such as uppercase letters, lowercase letters, numbers, and special characters. It is managed through PAM; to tune it, edit /etc/security/pwquality.conf

# passwords must include characters from at least 4 different classes
minclass = 4

Password Length

Password length sets the minimum number of characters required in a password. It is also configured through the pam_pwquality module.

# set minimum password length
minlen = 12

Password Lifecycle Controls

Password lifecycle controls require users to change their passwords regularly.

Password Expiration

Password expiration forces users to change their passwords after a certain number of days. chage is used to control this setting on a per-account basis.

# to set the user's maximum password age to 90 days
chage -M 90 samuel
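
To review the current aging settings for an account:

# to list password aging settings for a user
chage -l samuel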

Password History

Password history keeps track of old passwords so that password-reuse restrictions can be enforced.

Password Reuse

Password reuse prevents users from reusing old passwords. pam_pwhistory tracks old passwords in order to block password reuse. To change the settings, edit /etc/pam.d/common-password.

# to prevent any user from reusing their last 5 passwords
password requisite pam_pwhistory.so remember=5

Checking existing breach lists

Have I Been Pwned - HIBP

Checks email addresses against known public breaches.

Have I Been Pwned haveibeenpwned.com

Via API:

# check for email in breach
https://haveibeenpwned.com/api/v3/breachedaccount/jdoe@email.com
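
A sketch of calling that endpoint with curl; note the v3 API requires an API key sent in the hibp-api-key header (YOUR_KEY is a placeholder):

# to query breaches for an account
curl -H "hibp-api-key: YOUR_KEY" "https://haveibeenpwned.com/api/v3/breachedaccount/jdoe@email.com"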

DeHashed

DeHashed provides deeper insight with email, phone number, username, IP address, and document searches in breach data.

Via API:

# to search breach data for a selected email address
https://api.dehashed.com/search?query=jdoe@email.com&size=20

Intelx.io

Intelx.io provides an enterprise-grade OSINT solution, aggregating data from dark-web forums, paste sites, and public breach dumps with powerful query syntax and API access.

Restricted shell use

/sbin/nologin

/sbin/nologin prevents interactive login.

ex:

# to create a user with no shell access, suitable for automated services
useradd -s /sbin/nologin backupbot

/bin/rbash

/bin/rbash provides limited shell access to users. It restricts actions like changing directories, modifying environment variables, or executing programs from unexpected locations.

ex:

# create a user with restricted bash shell
useradd -s /bin/rbash -m reports

pam_tally2

pam_tally2 helps monitor and respond to failed login attempts. On newer distributions it has been replaced by pam_faillock.

It is configured in /etc/pam.d/common-auth or /etc/pam.d/login:

# to lock account after 5 failed attempts
# and automatically unlock it after 10 minutes
auth required pam_tally2.so onerr=fail deny=5 unlock_time=600

pam_tally2 # to view a summary of all failed attempts

pam_tally2 --user john # to view a summary of all failed attempts for a selected user

Avoid Running as root user

sudoers

/etc/sudoers is edited using visudo to prevent errors.

# give the user access to restart Nginx and nothing more
john ALL=(ALL) /sbin/systemctl restart nginx

PolKit (PolicyKit)

pkexec runs a command as another user

pkaction lists available privileged operations on the system

pkcheck checks whether a user is authorized to perform a specific action

pkttyagent provides a prompt for authentication in a terminal session
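
For example (the command being run here is illustrative):

# to run a command as root through Polkit
pkexec systemctl restart nginx

# to list registered Polkit actions
pkaction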

Linux: Firewall

Firewall Configuration and Management

Zones

A zone is a named profile that carries its own rule set defining which services and ports are allowed through. To create a zone, run firewall-cmd --permanent --new-zone=<ZONE-NAME> && firewall-cmd --reload. Run firewall-cmd --get-zones to see all zones.

Runtime Settings

Runtime settings take effect immediately and stay active until the next reboot or manual reload.

Permanent Settings

Permanent settings persist across reboots but do not touch the running firewall until a reload.

firewall-cmd

firewall-cmd is the command line tool used to manage firewalld configurations. The general syntax is firewall-cmd <OPTION> <OPTION VALUES>.

Useful options include:

  • --get-zones: display all zones
  • --get-active-zones: shows only zones that currently have bound interfaces
  • --list-all --zone=<ZONE>: displays every rule in a given zone
  • --add-port=<PORT>/<PROTOCOL>: opens individual ports
  • --remove-port=<PORT>/<PROTOCOL>: closes individual ports
  • --runtime-to-permanent: to copy current rule set to disk
  • --set-default-zone=<ZONE>: to change the default zone assigned to new interfaces
  • --zone=<ZONE> --change-interface=<INTERFACE>: to assign an interface to a zone

Rules and Access Control

Ports

# to add a port
firewall-cmd --zone=internal --add-port=8080/tcp --permanent

# to remove a port
firewall-cmd --zone=internal --remove-port=8080/tcp --permanent

Services

# to add the https service
firewall-cmd --zone=internal --add-service=https --permanent

# to remove the https service
firewall-cmd --zone=internal --remove-service=https --permanent
Rich Rules

Rich rules extend firewalld with "if-this-then-that" logic.

# to add rich rule to a zone
firewall-cmd --zone=public --permanent --add-rich-rule='rule family="ipv4" source address="192.168.1.10" service name="ssh" accept'

Uncomplicated Firewall - (UFW)

By default UFW blocks all incoming traffic and allows all outgoing traffic. It writes every change directly to its configuration files and loads them at boot.

ufw enable # to enable UFW service

ufw disable # to disable UFW service

ufw allow 8080/tcp # to add an allow rule

ufw allow ssh # to add an allow rule by service name

ufw deny 23 # to add a deny rule

ufw delete allow http # to delete an allow rule

ufw allow from 192.168.1.10 # to allow traffic from a specific IP address

ufw deny from 192.168.1.10 # to deny traffic from a specific IP address

ufw allow from 192.168.1.0/24 to any port 22 # to allow a subnet to access a specific port

ufw status numbered # to see the numbered rule set

ufw delete 2 # to delete rule number 2

ufw default deny incoming # to set default incoming (deny)

ufw default allow outgoing # to set default outgoing (allow)

iptables

iptables is a command line utility used for traffic filtering and alteration. It is built around tables. The main tables are:

  • filter
  • nat
  • mangle
  • raw
  • security

Each table contains chains:

  • INPUT: inspects packets destined for the local system
  • OUTPUT: filters packets originating from the local system
  • FORWARD: filters packets moving through the system

The general syntax is iptables [-t <TABLES>] -A <CHAIN> -p <PROTOCOL> [MATCH OPTION] -j <TARGET>.

ex:

# to accept SSH traffic into a host
iptables -t filter -A INPUT -p tcp --dport 22 -j ACCEPT

ipset

ipset groups many IP addresses or subnets into sets, letting the system match and process packets more efficiently than checking each address individually. The generic syntax is ipset [OPTIONS] <COMMAND> <SETNAME> [PARAMS]. Common commands include: create, add, del, list.

# to keep a dynamic deny list of known-bad ip addresses and tie it back into iptables (see the rule after this block)

# create new set
ipset create bad_hosts_list hash:ip

# add offending ip address to set
ipset add bad_hosts_list 172.0.0.25

# view ipset list
ipset list bad_hosts_list
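
To tie the set back into iptables, as the first comment suggests, add a rule that matches against it:

# drop any packet whose source address is in the set
iptables -I INPUT -m set --match-set bad_hosts_list src -j DROP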

nftables

nftables is a single framework that merges the functions of iptables, ip6tables, arptables, and ebtables. It is the modern successor to iptables. The general syntax is nft [OPTIONS] add rule <FAMILY> <TABLE> <CHAIN> <EXPRESSION>

# to allow ssh on port 22
nft add rule inet filter input tcp dport 22 ct state new accept
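
The rule above assumes the inet filter table and input chain already exist; a minimal setup sketch:

# create the table and a hooked input chain first
nft add table inet filter
nft add chain inet filter input '{ type filter hook input priority 0; }'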

Netfilter Module

The Netfilter module is a Linux kernel module that acts as a digital gatekeeper, examining every data packet entering or leaving the system and deciding whether it should be blocked or allowed according to predefined rules. It is the backend for iptables, ip6tables, and nftables.

Stateful and Stateless Firewall

Stateless Firewall

A stateless firewall treats each incoming packet independently using pre-defined rules.

# accept http traffic
iptables -A INPUT -p tcp --dport 80 -j ACCEPT

# drop all other traffic
iptables -A INPUT -j DROP

Every packet is checked to see if it is destined for port 80. If not, it is dropped.

Stateful Firewall

A stateful firewall remembers ongoing communication sessions between computers.

# Allow established and related packets
iptables -A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT

# Allow new ssh connections
iptables -A INPUT -p tcp --dport 22 -m state --state NEW -j ACCEPT

# Drop everything else
iptables -A INPUT -j DROP

IP Forwarding

IP forwarding allows the system to pass network traffic from one interface to another, acting like a router. It is disabled by default. Set IP forwarding permanently with net.ipv4.ip_forward = 1 in /etc/sysctl.conf. To enable IP forwarding temporarily, run sysctl -w net.ipv4.ip_forward=1.
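
As runnable commands, a sketch of the steps just described:

# to enable ip forwarding until the next reboot
sysctl -w net.ipv4.ip_forward=1

# to make the setting persistent, then apply the file
echo "net.ipv4.ip_forward = 1" | sudo tee -a /etc/sysctl.conf
sysctl -p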

Linux: Authorization, Authentication, and Accounting

Local Authentication

PAM (Pluggable Authentication Modules)

PAM handles the core authentication process: validating usernames and passwords and enforcing policies. PAM relies on modules to handle specific parts of the authentication process. These modules are configured in files located in the /etc/pam.d/ directory.

PAM module types

  • auth: verifies user identity
  • account: enforces access policies
  • password: handles password updates
  • session: manages tasks that happen at the start or end of a session

PAM uses control flags to determine how each module's result should affect the overall outcome. A sample stack is sketched after the flag list below.

Module flags:

  • required: the module must pass, processing continues even if it fails
  • requisite: the module must pass, failure causes immediate termination
  • sufficient: success means authentication may succeed early if no required module failed
  • optional: its result is ignored unless it is the only module of its type in the stack
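
A minimal sketch of how these flags combine in an auth stack (module choice and ordering vary by distribution):

# set up the environment first; this must pass
auth required pam_env.so

# standard Unix password check; success here is enough
auth sufficient pam_unix.so try_first_pass

# explicit deny if nothing above succeeded
auth required pam_deny.so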

Polkit (PolicyKit)

Polkit manages authorization: deciding if regular users can perform administrative or system-level actions without switching to root. The rules are configured in files in /etc/polkit-1/rules.d/ or /etc/polkit-1/localauthority/ directories.

Directory-based Identity Management

Kerberos

Kerberos is a secure network authentication protocol that handles authentication using a ticket-based system, allowing users and services to prove their identity without repeatedly sending passwords over the network.

LDAP (Lightweight Directory Access Protocol)

LDAP is a standardized protocol used to access and manage directory information. It provides a structured directory for storing user accounts, group memberships, and organizational information; it is where usernames, group definitions, and user attributes are stored.

SSSD (System Security Service Daemon) and Winbind

SSSD and Winbind act as intermediaries on Linux, connecting to these centralized services and letting the system use them seamlessly.

Network / Domain Integration

realm

realm is a tool that simplifies the process of joining systems to domains and sets up authentication with minimal manual configuration. realm enables identity and login integration with Windows domains, but it does not handle file or printer sharing.

ex:

realm discover my.domain.com # to discover domains

realm join --user=admin my.corporation.com # to join my.corporation.com domain using the admin credentials

realm list # to verify the configuration

realm permit --all # to permit all users to log in

realm permit admin@my.domain.com # to allow a specific user to log in

realm permit -g "Administrators" # to allow a group to log in

realm leave my.domain.com # to leave a domain

Samba

Samba provides deeper integration with Windows environments. It is focused on file sharing, printer access, and Windows-compatible network services. The main configuration is located in /etc/samba/smb.conf

example file share:

[global]
  workgroup = WORKGROUP
  server string = Samba Server
  security = user

[Public]
  path = /srv/samba/public
  browsable = yes
  writable = yes
  guest ok = yes

sudo systemctl start smb nmb # to start the samba services

sudo systemctl enable smb nmb # to enable samba at system start

Logging

/var/log

/var/log is the central directory on most Linux systems where log files are stored.

  • /var/log/messages: general system messages
  • /var/log/syslog: system-wide log
  • /var/log/kern.log: kernel-specific messages
  • /var/log/auth.log / /var/log/secure: authentication and authorization events
  • /var/log/boot.log: boot process messages
  • /var/log/dmesg: kernel ring buffer messages
  • /var/log/cron: cron job execution logs
  • /var/log/maillog / /var/log/mail.log: mail server logs
  • /var/log/Xorg.0.log: X server graphical session logs
  • /var/log/apt/ / /var/log/yum/: package manager logs
  • /var/log/journal/: systemd journal storage

rsyslog

rsyslog is a high-performance logging service that receives and stores log messages from the kernel, services, and applications. The configurations are stored in /etc/rsyslog.conf and /etc/rsyslog.d/*.conf

ex:

auth.* /var/log/auth.log # to store all authentication messages to a file

kern.warning /var/log/kern.log # to log only kernel warning messages and above

*.* @@log.server.com:514 # to send log messages to a remote server

Message severity levels

emerg # system unusable

alert # immediate action required

crit # critical conditions

err # errors

warning # warnings

notice # normal but significant

info # informational messages

debug # debug messages

journalctl

journalctl is a systemd tool used to view messages stored by the systemd journal.

journalctl -b # to view all logs for the current boot

journalctl -b -1 # to view all logs for the previous boot

journalctl -f # to follow (tail) the log

journalctl -k # to view logs from the kernel

journalctl -u nginx.service # to view logs for the nginx service

logrotate

logrotate is a tool for managing the size and rotation of log files, ensuring that logs do not fill up the disk over time. The main configuration is located in /etc/logrotate.conf and /etc/logrotate.d/.

ex:

logrotate -d /etc/logrotate.conf # to test the configuration in debug (dry-run) mode

logrotate -f /etc/logrotate.conf # to force log rotation

System Audit

auditd

auditd is a service that records audit events to disk, and administrators control it with the systemctl utility.

audit.rules

audit.rules is the configuration file that tells the audit subsystem precisely which activity to record. The configuration is located in /etc/audit/rules.d/audit.rules.

ex:

-w /etc/passwd -p wa -k passwd_changes # to watch /etc/passwd for writes and attribute changes

-w /var/log/lastlog -p wa -k login_logs # to watch for user logins

-w /var/run/faillock -p wa -k failed_logins # to watch for failed logins

ausearch -k passwd_changes # to search audit logs by key
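
Two related commands round out the workflow (rule-loading details vary by distribution):

auditctl -l # to list the currently loaded audit rules

augenrules --load # to regenerate and load rules from /etc/audit/rules.d/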