Grep matches only of multiple separated strings
By : Torben Scherzer
Date : March 29 2020, 07:55 AM
I have a file with lines containing this format: ... Is a sed solution acceptable? The command below keeps the first two space-separated fields and the fieldD=... value, and drops the rest of the line. code :
sed 's/^\([^ ]* [^ ]*\).*\(fieldD=[^,]*\).*/\1 \2/' filename
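To see what the expression keeps, here is a minimal sketch run on a made-up line. The sample line, including the timestamp and every field name other than fieldD, is an assumption, since the sample from the question did not survive.

```shell
# Hypothetical input line: two leading space-separated words, then key=value pairs.
line='2020-03-29 07:55:01 fieldA=1,fieldB=2,fieldD=hello,fieldE=5'

# \1 captures the first two space-separated words;
# \2 captures "fieldD=" and everything up to the next comma.
result=$(printf '%s\n' "$line" | sed 's/^\([^ ]* [^ ]*\).*\(fieldD=[^,]*\).*/\1 \2/')
printf '%s\n' "$result"
```

Lines that contain no fieldD= pass through unchanged. Running sed with -n and adding a p flag after the final slash prints only the lines where the substitution happened.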
|
grep using a list to find matches in a file, and print only the first occurrence for each string in the list
By : LightDragooon
Date : March 29 2020, 07:55 AM
I have a file, for example "queries.txt", that contains strings separated by hard returns. I want to use this list to find matches in a second file, "biglist.txt". If you want to "reset the counter" for each search string, you could run grep once per string: code :
cat queries.txt | xargs -I{} grep -m 1 -w {} biglist.txt > output
cat queries.txt - produce one "search word" per line
xargs -I{} - take the input one line at a time, and insert it at {}
grep -m 1 -w - find only one match of a whole word
{} - this is where xargs inserts the search term (once per call)
biglist.txt - the file to be searched
> output - the file where the result is to be written
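One caveat with the xargs form: grep treats each query as a regular expression, and xargs itself interprets quotes and backslashes in its input. The following is a sketch of a loop-based variant that treats every query as a fixed string. The file contents are invented for illustration, and the scratch directory is only there so the sketch does not touch real files.

```shell
# Build small, hypothetical versions of the two files in a scratch directory.
tmp=$(mktemp -d)
printf '%s\n' 'apple' 'kiwi' 'a.c' > "$tmp/queries.txt"
printf '%s\n' 'apple pie' 'abc soup' 'apple tart' 'a.c file' 'pineapple' > "$tmp/biglist.txt"

# Read one query per line and run grep once per query.
# -F = fixed string (no regex), -w = whole word, -m 1 = stop after the first match.
found=$(
    while IFS= read -r query; do
        [ -n "$query" ] || continue                               # skip blank lines
        grep -F -w -m 1 -e "$query" "$tmp/biglist.txt" || true    # no match is not an error
    done < "$tmp/queries.txt"
)
printf '%s\n' "$found"
rm -rf "$tmp"
```

With -F, the query a.c matches only the literal text "a.c" and not "abc". Queries with no match, such as kiwi here, simply produce no output line.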
|
How to save the lines of grep matches?
By : Romulo
Date : March 29 2020, 07:55 AM
If I understand your question correctly, you only need the first occurrence of each filename. You can achieve this with awk: code :
awk '!x[$2]++' file.txt
INPUT hello.txt
OUTPUT stack.txt
INPUT overflow.txt
OUTPUT byebye.txt
INPUT nick.txt
OUTPUT jesus.txt
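A small sketch of why this one-liner keeps only first occurrences. The input below is invented for illustration and assumes, as in the lines above, a tag in column 1 and a filename in column 2.

```shell
# Hypothetical input with repeated filenames in the second column.
input='INPUT hello.txt
INPUT hello.txt
OUTPUT stack.txt
OUTPUT stack.txt
INPUT overflow.txt'

# x[$2]++ evaluates to 0 the first time a filename is seen and to a positive
# count afterwards, so !x[$2]++ is true exactly once per distinct filename.
# A true pattern with no action prints the line.
unique=$(printf '%s\n' "$input" | awk '!x[$2]++')
printf '%s\n' "$unique"
```

The original order of the lines is preserved, which is the main difference from sort -u style approaches.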
|
How to grep exact matches from a file of a list of strings
By : Badri
Date : March 29 2020, 07:55 AM
I have a file A with one column containing a list of strings like this: ... You can use awk instead: code :
awk 'FNR==NR{a[$1];next} ($4 in a)' A B
chr13 50571142 50592603 ADAMTS9 21461 +
chr19 50180408 50191707 AIP 11299 +
# Variant: match the string in any column, printing each matching line only once
awk 'FNR==NR{a[$1];next} {for (i=1; i<=NF; i++) if ($i in a) {print; next}}' A B
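The two-file idiom can be tried end to end with the sketch below. The contents of A and the AIPL1 row in B are assumptions added to show that the match is exact; the other two rows of B are the ones shown above.

```shell
# Hypothetical files A (strings to look up) and B (rows to filter).
tmp=$(mktemp -d)
printf '%s\n' 'ADAMTS9' 'AIP' > "$tmp/A"
printf '%s\n' \
    'chr13 50571142 50592603 ADAMTS9 21461 +' \
    'chr1 100 200 AIPL1 100 -' \
    'chr19 50180408 50191707 AIP 11299 +' > "$tmp/B"

# While reading the first file (FNR==NR), store each string as an array key.
# While reading the second file, print rows whose 4th field is one of those keys.
matches=$(awk 'FNR==NR{a[$1];next} ($4 in a)' "$tmp/A" "$tmp/B")
printf '%s\n' "$matches"
rm -rf "$tmp"
```

Because the lookup is an exact array-key test on field 4, the string AIP does not match the AIPL1 row.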
|
Grep across multiple lines but returning all matches
By : knappetroll
Date : March 29 2020, 07:55 AM
awk should be able to do what you want; perl and sed probably could too, for that matter. I believe using -E (extended regex) is making your regex too greedy. As for why your grep -P is not working, you'll have to check grep --version and grep --help and do some research. Mine is working fine with GNU grep 2.22 on Ubuntu 16.04. code :
awk 'BEGIN {ln=1; lck="n"; print "---"};
lck=="y" {print ln")",$0};
$3=="UNLOCK" {lck="n"; ln++; print "---"; next};
$3=="LOCK" && lck=="n" {print ln")",$0; lck="y";ln++; next};
{ln++};
' NEWSJBHQDB12A.log > NEWSJBHQDB12A_filtered.txt;
$ cat file
2302221 Query LOCK TABLES browse WRITE, browse_being_allocated WRITE
2302221 Query SELECT id,startAtom,finishAtom FROM browse_being_allocated WHERE poolID = 31543 AND rushID = '32ca680dd0d84f9b9b2945e2186c09ff' AND format = 516 AND startAtom <= 1182716 AND finishAtom > 1182716
2302221 Query INSERT INTO browse (poolId,atom,skew,format,rushID,start,finish,databytes,srcPoolID,srcAtom,srcSkew,arrived) VALUES (31543,1182716,0,516,'32ca680dd0d84f9b9b2945e2186c09ff',274545,274588,315392,0,0,0,1)
2302221 Query UPDATE browse_being_allocated SET startAtom = 1182717 WHERE id = 26471948
2302221 Query UNLOCK TABLES
2334151 Change user user@dbsrv1 on db
2334151 Query SET NAMES utf8
2334151 Query SET character_set_results = NULL
2334151 Query LOCK TABLES browse WRITE, browse_being_allocated WRITE
2302201 Change user user@dbsrv1 on db
2302201 Query SET NAMES utf8
2302201 Query SET character_set_results = NULL
2302201 Query SELECT DISTINCT rushID FROM tags WHERE rushID NOT IN (SELECT DISTINCT rushID FROM essencefragments) GROUP BY rushID 151216 19:00:39
2566722 Quit
2522564 Query LOCK TABLES browse WRITE, browse_being_allocated WRITE
2522564 Query SELECT id,startAtom,finishAtom FROM browse_being_allocated WHERE poolID = 31543 AND rushID = '32ca680dd0d84f9b9b2945e2186c09ff' AND format = 516 AND startAtom <= 1182717 AND finishAtom > 1182717
2522564 Query INSERT INTO browse (poolId,atom,skew,format,rushID,start,finish,databytes,srcPoolID,srcAtom,srcSkew,arrived) VALUES (31543,1182717,0,516,'32ca680dd0d84f9b9b2945e2186c09ff',274588,274633,331776,0,0,0,1)
2522564 Query UPDATE browse_being_allocated SET startAtom = 1182718 WHERE id = 26471948
2522564 Query UNLOCK TABLES
$ awk 'BEGIN {ln=1; lck="n"; print "---"};
lck=="y" {print ln")",$0};
$3=="UNLOCK" {lck="n"; ln++; print "---"; next};
$3=="LOCK" && lck=="n" {print ln")",$0; lck="y";ln++; next};
{ln++};
' file
---
1) 2302221 Query LOCK TABLES browse WRITE, browse_being_allocated WRITE
2) 2302221 Query SELECT id,startAtom,finishAtom FROM browse_being_allocated WHERE poolID = 31543 AND rushID = '32ca680dd0d84f9b9b2945e2186c09ff' AND format = 516 AND startAtom <= 1182716 AND finishAtom > 1182716
3) 2302221 Query INSERT INTO browse (poolId,atom,skew,format,rushID,start,finish,databytes,srcPoolID,srcAtom,srcSkew,arrived) VALUES (31543,1182716,0,516,'32ca680dd0d84f9b9b2945e2186c09ff',274545,274588,315392,0,0,0,1)
4) 2302221 Query UPDATE browse_being_allocated SET startAtom = 1182717 WHERE id = 26471948
5) 2302221 Query UNLOCK TABLES
---
9) 2334151 Query LOCK TABLES browse WRITE, browse_being_allocated WRITE
10) 2302201 Change user user@dbsrv1 on db
11) 2302201 Query SET NAMES utf8
12) 2302201 Query SET character_set_results = NULL
13) 2302201 Query SELECT DISTINCT rushID FROM tags WHERE rushID NOT IN (SELECT DISTINCT rushID FROM essencefragments) GROUP BY rushID 151216 19:00:39
14) 2566722 Quit
15) 2522564 Query LOCK TABLES browse WRITE, browse_being_allocated WRITE
16) 2522564 Query SELECT id,startAtom,finishAtom FROM browse_being_allocated WHERE poolID = 31543 AND rushID = '32ca680dd0d84f9b9b2945e2186c09ff' AND format = 516 AND startAtom <= 1182717 AND finishAtom > 1182717
17) 2522564 Query INSERT INTO browse (poolId,atom,skew,format,rushID,start,finish,databytes,srcPoolID,srcAtom,srcSkew,arrived) VALUES (31543,1182717,0,516,'32ca680dd0d84f9b9b2945e2186c09ff',274588,274633,331776,0,0,0,1)
18) 2522564 Query UPDATE browse_being_allocated SET startAtom = 1182718 WHERE id = 26471948
19) 2522564 Query UNLOCK TABLES
---
$ cat file
NO PRINT
NO PRINT
1 Query LOCK
STUFF
STUFF
STUFF
1 Query UNLOCK
NO PRINT
2 Query LOCK
STUFF
2 Query UNLOCK
NO PRINT
NO PRINT
NO PRINT
NO PRINT
$ awk 'BEGIN {ln=1; lck="n"; print "---"};
lck=="y" {print ln")",$0};
$3=="UNLOCK" {lck="n"; ln++; print "---"; next};
$3=="LOCK" && lck=="n" {print ln")",$0; lck="y";ln++; next};
{ln++};
' file
---
3) 1 Query LOCK
4) STUFF
5) STUFF
6) STUFF
7) 1 Query UNLOCK
---
9) 2 Query LOCK
10) STUFF
11) 2 Query UNLOCK
---
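For comparison, here is a sketch of a shorter variant that relies on awk's built-in NR line counter instead of the hand-maintained ln variable. It is not the script from the answer above, only an alternative written under the same assumption that the third field is LOCK or UNLOCK; the input is a shortened version of the small sample.

```shell
# Shortened version of the LOCK/UNLOCK sample.
sample='NO PRINT
1 Query LOCK
STUFF
1 Query UNLOCK
NO PRINT
2 Query LOCK
2 Query UNLOCK
NO PRINT'

# lck is 0 (unset) outside a block and 1 inside one. Rules run in order per line.
filtered=$(printf '%s\n' "$sample" | awk '
    BEGIN                 { print "---" }
    $3 == "LOCK" && !lck  { lck = 1 }                 # entering a locked block
    lck                   { print NR ")", $0 }        # print every line inside the block
    $3 == "UNLOCK"        { lck = 0; print "---" }    # leaving the block
')
printf '%s\n' "$filtered"
```

Ordering the rules this way lets the LOCK and UNLOCK lines be printed by the same rule as the lines between them, so no next statements are needed.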
|