String-Manipulation

STRING MANIPULATION : Play with strings

strings : Extracts readable text from binary files. It scans the binary data for sequences of printable characters (at least 4 characters long by default) and prints them. This is often useful for pulling human-readable information out of compiled executables or other binary files.
```
strings data.txt
```

env : Prints the current environment variables. Environment variables in Linux are named values that can affect the behavior of processes and programs running in the operating system.

# display env variables

$ env

$ echo $PATH

/usr/local/sbin:/usr/local/bin:/usr/sbin:/bin

# This returns a list of paths separated by a colon that your system searches when it runs a command.

# Append a directory to the PATH variable (lasts for the current shell session only)

export PATH="$PATH:/path/to/new/directory"
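
To see what export actually changes, here is a minimal sketch. DEMO_SETTING and DEMO_ONE_OFF are made-up variable names used only for illustration, and the example assumes neither is already set in your environment.

```shell
# DEMO_SETTING is a hypothetical name; assumed not to exist in the environment yet.
DEMO_SETTING="visible"

# A plain shell variable is not passed to child processes.
before=$(sh -c 'printf "%s" "${DEMO_SETTING:-unset}"')

# After export, child processes inherit it.
export DEMO_SETTING
after=$(sh -c 'printf "%s" "${DEMO_SETTING:-unset}"')

# env can also set a variable for a single command only.
one_off=$(env DEMO_ONE_OFF=42 sh -c 'printf "%s" "$DEMO_ONE_OFF"')

echo "before=$before after=$after one_off=$one_off"
```

The same rule applies to PATH: a change made with export is inherited by programs started from that shell, but it is gone once the shell exits.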

cut : It is used for cutting out sections from each line of a file or from piped data.

-c, --characters=LIST: Select only these characters.
-f, --fields=LIST: Select only these fields.
-d, --delimiter=DELIM: Use DELIM instead of TAB for field delimiter.

# Extract 5th character from each line
cut -c 5 sample.txt

# Extract characters 3-7 and 12-16 from each line of a file:
cut -c 3-7,12-16 filename.txt

# Extract the first and third fields (using a comma as the delimiter) from a file
# By default, the delimiter ('-d') is TAB
cut -f 1,3 -d , filename.csv

# Extract the second field (using a comma as the delimiter) from 
# the output of another command:
echo "John,Doe,25" | cut -f 2 -d ,
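
A small sketch that combines field and character selection. The records below are invented sample data, not taken from any real file.

```shell
# Hypothetical sample data in the form name,surname,age
records='John,Doe,25
Jane,Roe,31'

# Fields 1 and 3, comma-delimited; cut keeps the delimiter between selected fields.
names_ages=$(printf '%s\n' "$records" | cut -d , -f 1,3)

# First three characters of each line.
prefixes=$(printf '%s\n' "$records" | cut -c 1-3)

printf '%s\n' "$names_ages" "$prefixes"
```

Note that cut always prints the selected fields in their original order, so `-f 3,1` gives the same result as `-f 1,3`.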

head : By default the head command will show you the first 10 lines in a file.
- The -n flag stands for number of lines.
```
head -n 15 /var/log/syslog # display first 15 lines
head -c 15 /var/log/syslog # display first 15 bytes
```

tail : Similar to head, but prints the last 10 lines of a file by default.

tail -n 3 filename # display the last 3 lines
tail -n +3 filename # display from line 3 onward (skips the first 2 lines)
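
head and tail are often chained to pull a range of lines out of the middle of some input. A minimal sketch, using seq to generate ten numbered lines as stand-in data:

```shell
# Generate ten numbered lines as stand-in input.
lines=$(seq 1 10)

# Lines 4 through 6: take the first 6, then keep the last 3 of those.
middle=$(printf '%s\n' "$lines" | head -n 6 | tail -n 3)

# Everything from line 9 onward.
from_ninth=$(printf '%s\n' "$lines" | tail -n +9)

printf '%s\n' "$middle" "$from_ninth"
```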

paste : Merges corresponding lines of its input files side by side, separated by TABs by default.

EXAMPLES

# Display the first 5 lines (head) and the last 5 lines (tail) of a file side by side
# <(...) is process substitution, available in shells such as bash and zsh
paste <(head -n 5 filename.txt) <(tail -n 5 filename.txt)

# Display the first 3 and the last 3 entries of the ls output side by side
paste <(ls | head -n 3) <(ls | tail -n 3)
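
If your shell does not support process substitution, temporary files work too. The sketch below uses two invented columns of data and also shows the -d (delimiter) and -s (serial) options.

```shell
# Hypothetical input: two small files holding one column each.
tmpdir=$(mktemp -d)
printf '%s\n' alice bob > "$tmpdir/names.txt"
printf '%s\n' 30 25 > "$tmpdir/ages.txt"

# Side by side, with a comma instead of the default TAB.
merged=$(paste -d , "$tmpdir/names.txt" "$tmpdir/ages.txt")

# -s joins all lines of one file into a single line.
serial=$(paste -s -d , "$tmpdir/names.txt")

printf '%s\n' "$merged" "$serial"
rm -r "$tmpdir"
```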

expand : Converts TABs to spaces. Existing spaces are left untouched.

unexpand : The opposite of expand; converts spaces back to TABs. The -a flag converts all runs of spaces, not just leading ones.

expand sample.txt > result.txt
unexpand -a result.txt

tr -s '\t' ' ' < input.txt > output.txt # Replace tabs with spaces and squeeze repeated spaces into one
tr -s '\n' ' ' < sample.txt # Replace newlines with spaces
tr -s ';' ' ' < sample.txt # Replace semicolons with spaces

**NOTE: These commands do not modify the input file. They write to standard output, so redirect the output to a new file to keep it.**
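
The difference between expand and tr is easier to see on a tiny input. This is a sketch with made-up strings; the lengths are checked with `${#var}` because the padding is hard to see by eye.

```shell
# A line with a TAB between the two letters (printf interprets \t).
tabbed=$(printf 'a\tb')

# expand pads to the next tab stop (every 8 columns by default).
expanded=$(printf '%s\n' "$tabbed" | expand)

# -t 4 sets tab stops every 4 columns instead.
expanded4=$(printf '%s\n' "$tabbed" | expand -t 4)

# unexpand -a turns the run of spaces back into a TAB.
restored=$(printf '%s\n' "$expanded" | unexpand -a)

# tr -s translates each TAB to a space, then squeezes repeated spaces into one.
squeezed=$(printf 'a\t\t  b\n' | tr -s '\t' ' ')

echo "expand: ${#expanded} chars, expand -t 4: ${#expanded4} chars, tr -s: '$squeezed'"
```

expand preserves column alignment, while tr -s collapses everything to a single space, so pick whichever matches what the next tool in the pipeline expects.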

sort : Sorts the lines of a file.
- -r, --reverse: Reverse the result of comparisons.
- -n, --numeric-sort: Sort numerically.
- -u, --unique: Output only unique lines.
```
sort filename.txt

# '-m' merges files that are already sorted, without sorting them again
sort -m file1.txt file2.txt file3.txt > sorted_output.txt
```
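
The flags above matter most with numbers, where the default character-by-character comparison gives surprising results. A short sketch with invented input:

```shell
numbers='10
9
2
10'

# Default order compares character by character, so "10" sorts before "2".
lexical=$(printf '%s\n' "$numbers" | sort)

# -n compares by numeric value; -u drops the repeated 10.
numeric_unique=$(printf '%s\n' "$numbers" | sort -n -u)

# -r reverses the numeric order.
descending=$(printf '%s\n' "$numbers" | sort -n -r)

printf '%s\n' "$lexical" --- "$numeric_unique" --- "$descending"
```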

tr : The tr (translate) command allows you to translate a set of characters into another set of characters.

echo "hello" | tr a-z A-Z # converts lowercase to uppercase

HELLO

echo "123abc456" | tr -d '0-9' # Delete digits from string
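
A few more tr patterns, sketched with inline input. The -c flag (complement) is not covered above; it inverts the set, so combined with -d it keeps only the listed characters.

```shell
# Character classes are an alternative to writing a-z / A-Z ranges.
upper=$(printf 'hello\n' | tr '[:lower:]' '[:upper:]')

# -d deletes; -c complements the set, so this keeps only digits and the newline.
digits_only=$(printf '123abc456\n' | tr -cd '0-9\n')

# ROT13: shift every letter 13 places along the alphabet.
rot13=$(printf 'Hello\n' | tr 'A-Za-z' 'N-ZA-Mn-za-m')

printf '%s\n' "$upper" "$digits_only" "$rot13"
```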

split : Splits a file into smaller files. By default, each output file holds 1000 lines and is named xaa, xab, and so on.

# This splits input.txt into files each containing 100 lines.
split -l 100 input.txt

-l NUMBER: Split the file into chunks of NUMBER lines.
-b SIZE[K|M|G]: Split the file into chunks of specified size.
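
A minimal sketch of splitting a file and putting it back together. The 250-line input and the `chunk_` prefix are assumptions made for the example; split accepts an optional prefix after the file name to replace the default `x`.

```shell
tmpdir=$(mktemp -d)
seq 1 250 > "$tmpdir/input.txt"

# Split into 100-line pieces named chunk_aa, chunk_ab, chunk_ac.
split -l 100 "$tmpdir/input.txt" "$tmpdir/chunk_"

# Count the pieces and the lines in the last (shorter) one.
chunk_count=$(ls "$tmpdir"/chunk_* | wc -l | tr -d ' ')
last_chunk_lines=$(wc -l < "$tmpdir/chunk_ac" | tr -d ' ')

# Concatenating the pieces in name order restores the original file.
roundtrip=failed
cat "$tmpdir"/chunk_* | cmp -s - "$tmpdir/input.txt" && roundtrip=ok

echo "chunks=$chunk_count last=$last_chunk_lines roundtrip=$roundtrip"
rm -r "$tmpdir"
```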

join : Joins the lines of two files on a common field. Both files must be sorted on that field. Here, -1 2 -2 1 matches field 2 of file1.txt against field 1 of file2.txt:
```
$ join -1 2 -2 1 file1.txt file2.txt

1 John Doe

2 Jane Doe

3 Mary Sue
```
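
The contents of file1.txt and file2.txt are not shown above, so the sketch below assumes inputs that would produce that output: the ID sits in column 2 of the first file and column 1 of the second.

```shell
tmpdir=$(mktemp -d)

# Assumed inputs (hypothetical), already sorted on the ID column.
printf '%s\n' 'John 1' 'Jane 2' 'Mary 3' > "$tmpdir/file1.txt"
printf '%s\n' '1 Doe' '2 Doe' '3 Sue' > "$tmpdir/file2.txt"

# The join field is printed first, then the remaining fields of each file.
joined=$(join -1 2 -2 1 "$tmpdir/file1.txt" "$tmpdir/file2.txt")

printf '%s\n' "$joined"
rm -r "$tmpdir"
```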

uniq : The uniq (unique) command filters out or reports repeated lines.

uniq reading.txt # remove adjacent duplicate lines

uniq -c reading.txt # prefix each line with the number of times it occurs

uniq -u reading.txt # get unique values (occur only one time)

uniq -d reading.txt # get duplicate values

**Note** : uniq does not detect duplicate lines unless they are adjacent.

# To overcome this limitation of uniq we can use sort in combination with uniq:
sort reading.txt | uniq
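
A common use of this combination is a frequency count. The word list below is invented for the example.

```shell
words='apple
banana
apple
cherry
apple
banana'

# Sort so identical lines are adjacent, count them, then rank by count.
ranked=$(printf '%s\n' "$words" | sort | uniq -c | sort -rn)

# uniq -c pads the count with spaces; strip the padding from the top entry.
top=$(printf '%s\n' "$ranked" | head -n 1 | tr -s ' ' | sed 's/^ //')

# Lines that occur exactly once, and lines that occur more than once.
once=$(printf '%s\n' "$words" | sort | uniq -u)
repeated=$(printf '%s\n' "$words" | sort | uniq -d)

echo "top: $top"
```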

wc : The wc command stands for "word count" and is used to count the number of lines, words, and bytes in a file.
```
wc myfile.txt

-l: Count lines.
-w: Count words.
-c: Count bytes (use -m to count characters).
```
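
When wc reads from standard input it prints only the number, which is handy in scripts. A small sketch with inline text; the `tr -d ' '` step is there because some implementations pad the number with spaces.

```shell
text='one two
three'

line_count=$(printf '%s\n' "$text" | wc -l | tr -d ' ')
word_count=$(printf '%s\n' "$text" | wc -w | tr -d ' ')
# Bytes include the two newline characters.
byte_count=$(printf '%s\n' "$text" | wc -c | tr -d ' ')

echo "lines=$line_count words=$word_count bytes=$byte_count"
```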
nl : used to number lines in a file
```
$ nl file1.txt

     1  i
     2  like
     3  turtles
```
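
By default nl right-aligns the number in a 6-character column and follows it with a TAB. If you want a different look, the -w (width) and -s (separator) options control that. A small sketch with inline input:

```shell
# -w 1 uses the narrowest number column; -s sets the text placed after the number.
numbered=$(printf '%s\n' i like turtles | nl -w 1 -s '. ')

printf '%s\n' "$numbered"
```

Keep in mind that nl skips empty lines when numbering unless you pass `-b a`.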
