String-Manipulation
- STRING MANIPULATION : Play with strings
-
strings
: Used to extract readable text strings from binary files. When you run strings on a file, it scans through the binary data looking for sequences of printable characters and displays them. This is often useful for extracting human-readable information from compiled executables or other binary files.
strings data.txt
-
env
: Environment variables in Linux are dynamic values that can affect the behavior of processes and programs running in the operating system.
# Display environment variables
$ env
$ echo $PATH
/usr/local/sbin:/usr/local/bin:/usr/sbin:/bin
# This returns a list of paths separated by a colon that your system searches when it runs a command.
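Because PATH is a single colon-separated string, it can be easier to read with one directory per line. A minimal sketch, using a made-up `sample_path` value instead of the real PATH so the result is predictable:

```shell
# sample_path is a hypothetical value for illustration; substitute "$PATH" to inspect your own.
sample_path="/usr/local/bin:/usr/bin:/bin"

# Replace each colon with a newline so every directory lands on its own line.
entries=$(printf '%s\n' "$sample_path" | tr ':' '\n')
printf '%s\n' "$entries"
```

The shell searches these directories from first to last, so the order of the lines is also the lookup order.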
# Modify the path variable
export PATH=$PATH:/path/to/new/directory
-
cut
: It is used for cutting out sections from each line of a file or from piped data.
-c, --characters=LIST: Select only these characters.
-f, --fields=LIST: Select only these fields.
-d, --delimiter=DELIM: Use DELIM instead of TAB as the field delimiter.
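The options above can be tried safely on a throwaway file. The sketch below is only an illustration: the CSV contents and the variable names (`tmp`, `names`, `prefix`) are invented for the example.

```shell
# Create a small throwaway CSV (hypothetical data, for illustration only).
tmp=$(mktemp)
printf 'id,name,age\n1,John,25\n2,Jane,31\n' > "$tmp"

# -d sets the delimiter and -f picks the field: here the second comma-separated field.
names=$(cut -d , -f 2 "$tmp")
printf '%s\n' "$names"

# -c picks character positions: here characters 1 through 4 of each line.
prefix=$(cut -c 1-4 "$tmp")
printf '%s\n' "$prefix"

rm -f "$tmp"
```

Note that `-c` counts raw positions, so the commas are included in its output, while `-f` drops the delimiter around the selected field.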
# Extract 5th character from each line
cut -c 5 sample.txt
# Extract characters 3-7 and 12-16 from each line of a file:
cut -c 3-7,12-16 filename.txt
# Extract the first and third fields (using a comma as the delimiter) from a file
# By default, the delimiter (-d) is TAB
cut -f 1,3 -d , filename.csv
# Extract the second field (using a comma as the delimiter) from
# the output of another command:
echo "John,Doe,25" | cut -f 2 -d ,
-
head
: By default, the head command shows you the first 10 lines of a file.
- The -n flag sets the number of lines to show.
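A quick way to see both behaviours without touching a real log file is to feed head generated input. This is a sketch that assumes `seq` is available (it ships with GNU coreutils and most BSD systems); the variable names are illustrative.

```shell
# seq prints the numbers 1..20, one per line, as stand-in input.
# With no options, head stops after 10 lines.
default_out=$(seq 1 20 | head)
printf '%s\n' "$default_out"

# -n 3 limits the output to the first 3 lines.
first_three=$(seq 1 20 | head -n 3)
printf '%s\n' "$first_three"
```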
head -n 15 /var/log/syslog # display first 15 lines
head -c 15 /var/log/syslog # display first 15 bytes
-
tail
: It is similar to the head command, but it prints the last 10 lines of a file by default.
tail -n 3 filename # display last 3 lines
tail -n +3 filename # start output at line 3 (skips first 2 lines)
-
paste
: The paste command is similar to the cat command, but it merges the corresponding lines of its input files side by side, separated by TABs.
- EXAMPLES
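As a small self-contained sketch of the side-by-side merge, the block below builds two throwaway files and pastes them together. The file names and contents are hypothetical, chosen only for illustration.

```shell
# Two throwaway files (hypothetical contents) with matching line counts.
dir=$(mktemp -d)
printf 'John\nJane\n' > "$dir/first.txt"
printf 'Doe\nRoe\n' > "$dir/last.txt"

# Default: corresponding lines are joined with a TAB.
tabbed=$(paste "$dir/first.txt" "$dir/last.txt")
printf '%s\n' "$tabbed"

# -d picks a different separator, here a comma.
comma=$(paste -d , "$dir/first.txt" "$dir/last.txt")
printf '%s\n' "$comma"

rm -rf "$dir"
```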
# Display the first 5 lines (head) and the last 5 lines (tail) of a file
paste <(head -n 5 filename.txt) <(tail -n 5 filename.txt)
# Display the first 3 lines (head) and the last 3 lines (tail) of the ls command
paste <(ls | head -n 3) <(ls | tail -n 3)
-
expand
: To convert your TABs to spaces, use the expand command. It does not remove existing spaces.
-
unexpand
: The opposite of expand; it converts spaces back to TABs.
expand sample.txt > result.txt
unexpand -a result.txt
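To see what the two commands actually do to the whitespace, it can help to run them on a single known line. The following is a sketch under the usual default of tab stops every 8 columns; the variable names are illustrative.

```shell
# One line containing a single TAB between two letters.
# expand replaces the TAB with spaces up to the next 8-column tab stop.
expanded=$(printf 'a\tb\n' | expand)
printf '%s\n' "$expanded"

# -t changes the tab stop width, here to 4 columns.
narrow=$(printf 'a\tb\n' | expand -t 4)
printf '%s\n' "$narrow"

# unexpand -a turns runs of spaces that end at a tab stop back into TABs.
restored=$(printf '%s\n' "$expanded" | unexpand -a)
printf '%s\n' "$restored"
```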
tr -s '\t' ' ' < input.txt > output.txt # replaces tabs with spaces and squeezes repeated spaces into one
tr -s '\n' ' ' < sample.txt # changes newlines into spaces
tr -s ';' ' ' < sample.txt # changes semicolons into spaces
**NOTE: These commands do not modify the input file; redirect the output to a file to save it.**
-
sort
: Sorts the lines of a file.
-r, --reverse: Reverse the result of comparisons.
-n, --numeric-sort: Sort numerically.
-u, --unique: Output only unique lines.
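The difference between the default text comparison and `-n` is easiest to see on a few numbers. A minimal sketch, with made-up input and illustrative variable names:

```shell
numbers=$(printf '10\n9\n2\n10\n')

# Plain sort compares text, so "10" sorts before "2".
as_text=$(printf '%s\n' "$numbers" | sort)
printf '%s\n' "$as_text"

# -n compares by numeric value instead.
as_numbers=$(printf '%s\n' "$numbers" | sort -n)
printf '%s\n' "$as_numbers"

# -u drops the repeated 10, and -r reverses the order.
unique_desc=$(printf '%s\n' "$numbers" | sort -n -u -r)
printf '%s\n' "$unique_desc"
```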
sort filename.txt
# '-m' merges files that are already sorted, without re-sorting them
sort -m file1.txt file2.txt file3.txt > sorted_output.txt
-
tr
: The tr (translate) command allows you to translate a set of characters into another set of characters.
tr a-z A-Z # converts lowercase to uppercase
hello
HELLO
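tr only reads standard input, so in scripts it is usually fed through a pipe instead of typed input as in the session above. A minimal sketch with made-up strings and illustrative variable names:

```shell
# Translate lowercase to uppercase on piped input.
upper=$(echo "hello" | tr 'a-z' 'A-Z')
printf '%s\n' "$upper"

# -s squeezes each run of the listed character into a single occurrence.
squeezed=$(echo "a    b" | tr -s ' ')
printf '%s\n' "$squeezed"

# Translating a delimiter to a newline puts each item on its own line.
items=$(echo "x,y,z" | tr ',' '\n')
printf '%s\n' "$items"
```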
echo "123abc456" | tr -d '0-9' # Delete digits from string
-
split
: Splits a file into several smaller files. By default, it starts a new output file every 1000 lines.
# This splits input.txt into files each containing 100 lines.
split -l 100 input.txt
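To see which files split creates, the sketch below runs it in a temporary directory on a generated 250-line file. The `part_` prefix and the variable names are hypothetical choices for the example; without a prefix, split names its output files starting with `x`.

```shell
dir=$(mktemp -d)
seq 1 250 > "$dir/input.txt"

# Split into chunks of 100 lines; "part_" is an illustrative prefix for the output names.
(cd "$dir" && split -l 100 input.txt part_)

# The pieces get alphabetical suffixes in order.
pieces=$(cd "$dir" && ls part_*)
printf '%s\n' "$pieces"

# The third piece holds the leftover lines, so it starts at line 201.
last_first=$(head -n 1 "$dir/part_ac")
printf '%s\n' "$last_first"

rm -rf "$dir"
```

Because the suffixes sort alphabetically, `cat part_*` would reassemble the pieces in their original order.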
-l NUMBER: Split the file into chunks of NUMBER lines.
-b SIZE[K|M|G]: Split the file into chunks of the specified size.
-
join
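: The join command allows you to join two files together by a common field. Here -1 2 uses the second field of file1.txt and -2 1 uses the first field of file2.txt as the join field:
$ join -1 2 -2 1 file1.txt file2.txt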
1 John Doe
2 Jane Doe
3 Mary Sue
-
uniq
: The uniq (unique) command is another useful tool for parsing text.
uniq reading.txt # remove adjacent duplicate lines
uniq -c reading.txt # get the count of how many occurrences of a line
uniq -u reading.txt # get unique values (occur only one time)
uniq -d reading.txt # get duplicate values
**Note** : uniq does not detect duplicate lines unless they are adjacent.
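The adjacency rule is easy to verify on two tiny inputs that contain the same lines in a different order. A minimal sketch with made-up input and illustrative variable names:

```shell
# The two "a" lines are separated by "b", so uniq keeps all three lines.
scattered=$(printf 'a\nb\na\n' | uniq)
printf '%s\n' "$scattered"

# Here the two "a" lines are adjacent, so uniq collapses them into one.
adjacent=$(printf 'a\na\nb\n' | uniq)
printf '%s\n' "$adjacent"
```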
# To overcome this limitation of uniq we can use sort in combination with uniq:
sort reading.txt | uniq
-
wc
: The wc command stands for "word count" and is used to count the number of lines, words, and bytes in a file.
wc myfile.txt
-l: Count lines.
-w: Count words.
-c: Count bytes (use -m to count characters).
-
nl
: Used to number the lines in a file.
$ nl file1.txt
     1	i
     2	like
     3	turtles
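By default, nl right-aligns each number in a six-character column and separates it from the text with a TAB. As a sketch, assuming an nl that supports the POSIX `-w` (width) and `-s` (separator) options, the numbering can be made compact with a period after each number:

```shell
# -w 1 uses the narrowest number column; -s '. ' puts a period and a space after the number.
numbered=$(printf 'i\nlike\nturtles\n' | nl -w 1 -s '. ')
printf '%s\n' "$numbered"
```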