Home

Thursday, May 6, 2021

grep for DBA's

Many times I'll want to inspect a file and because I'm a DBA, my first inclination is to load the file into a database table so that I can write SQL queries against it.  However, it's not always possible or timely to load the data into a database server to inspect a file.  I've put together some useful grep commands below that I run to get a sense of the data attributes about the file.

For pretend purposes, let say we have a file named Person.csv that contains records about people.  The file contains an Id, FirstName, LastName, DateRecordCreated and DateRecordUpdated columns.  See below.







To count the number of instances of "2021" in a file.

$ grep -c -i 2021 Person.csv

Output:

4


To search for the existence of a pattern in a file

$ grep -i '2021' Person.csv

Output:

1,Bart,Simpson,1990-01-01,2021-01-01

3,Seth,Rogen,2000-07-01,2021-04-30

6,Bill,Gates,1966-05-04,2021-05-01

8,Drew,Brees,1999-02-15,2021-02-01


To obtain the line number(s) of a pattern in a file

$ grep -n -i 2021 Person.csv

Output:

2:1,Bart,Simpson,1990-01-01,2021-01-01

4:3,Seth,Rogen,2000-07-01,2021-04-30

7:6,Bill,Gates,1966-05-04,2021-05-01

9:8,Drew,Brees,1999-02-15,2021-02-01


To count the number of non pattern matches

$ grep -v -c 2021 Person.csv

Output:

6


Count number of instances of multiple pattern

$ grep -ioh "2018\|2019\|2020\|2021" Person.csv | sort | uniq -c | sort -n

Output:

1 2020

2 2019

4 2021


Count number of lines in a file

$ grep -c ^ Person.csv

Output:

10

Percentage of Full Table Scans on MYSQL

Below is a useful script I use to determine what percentage of my queries in MYSQL are performing full table scans.  Limiting full table sca...