.Dd November 13, 2024
.Dt GRAB 1
.Os Grab 3.0.0
.Sh NAME
.Nm grab ,
.Nm "git grab"
.Nd search for patterns in files
.Sh SYNOPSIS
.Nm
.Op Fl H Ar never | multi | always
.Op Fl bcilLpsUz
.Ar pattern
.Op Ar
.Nm
.Fl h
.Pp
.Nm "git grab"
.Op Fl H Ar never | multi | always
.Op Fl bcilLpsUz
.Ar pattern
.Op Ar "glob ..."
.Nm "git grab"
.Fl h
.Sh DESCRIPTION
The
.Nm
utility searches for text matching the given
.Ar pattern
in the files listed on the command line,
printing the matches to the standard output.
Unlike the
.Xr grep 1
utility,
.Nm
is not strictly line-oriented;
the structure of matches is left up to the user to define.
For more details on the pattern syntax, see
.Sx Pattern Syntax .
.Pp
The
.Nm "git grab"
utility is identical to the
.Nm
utility except that it takes globs matching files as command-line
arguments instead of files,
and processes all non-binary files in the current git repository that
match the provided globs.
If no globs are provided,
all non-binary files in the current git repository are processed.
.Pp
.Nm
reads from the files provided on the command line.
If no files are provided, the standard input is read instead.
The special filename
.Sq \- ,
which represents the standard input,
can also be provided.
.Pp
Similar to the
.Xr grep 1
utility, matches are printed to the standard output.
They are additionally prefixed with the name of the file in which
.Ar pattern
was matched, as well as the location of the match.
.Pp
The options are as follows:
.Bl -tag -width Ds
.It Fl b , Fl Fl byte\-offset
Report the positions of pattern matches as the (zero-based) byte offset
of the match from the beginning of the file.
.Pp
This option is useful if your text editor
.Pq such as Xr vim 1 or Xr emacs 1
supports jumping directly to a given byte offset/position.
.Pp
This is the default behaviour if the
.Fl L
option is not provided.
.It Fl c , Fl Fl color
Force colored output,
even if the output device is not a TTY.
This is useful when piping the output of
.Nm
into a pager such as
.Xr less 1 .
.Pp
This option takes precedence over the environment variables described in
.Sx ENVIRONMENT
that relate to the usage of color.
.It Fl h , Fl Fl help
Display help information by opening this manual page.
.It Fl H , Fl Fl header\-line Ns = Ns Ar when
Control the usage of a dedicated header line,
where the filename and match position are printed on a dedicated line
above the match.
The available options for
.Ar when
are:
.Pp
.Bl -tag -width Ds -compact
.It never
never use a dedicated header line
.It always
always use a dedicated header line
.It multi
use a dedicated header line when the matched pattern spans multiple lines
.El
.It Fl i , Fl Fl ignore\-case
Match patterns case-insensitively.
.It Fl l , Fl Fl literal
Treat patterns as literal strings,
i.e. don’t interpret them as regular expressions.
.It Fl L , Fl Fl line\-position
Report the positions of matches as a (one-based) line and column
position separated by a colon.
.Pp
This option may be ill-advised in many circumstances.
See
.Sx BUGS
for more details.
.It Fl p , Fl Fl predicate
Return an exit status indicating if a match was found without writing any
output to the standard output.
When simply checking for the presence of a pattern in an input,
this option is far more efficient than redirecting output to
.Pa /dev/null .
.It Fl s , Fl Fl strip\-newline
Don’t print a newline at the end of a match if the match already ends in
a newline.
This can make output seem more
.Sq natural ,
as many matches will already have terminating newlines.
.It Fl U , Fl Fl no\-unicode
Don’t use Unicode properties when matching \ed, \ew, etc.
Recognize only ASCII values instead.
.It Fl z , Fl Fl zero
Separate output data by null bytes
.Pq Sq \e0
instead of newlines.
This option can be used to process matches containing newlines.
.El
.Ss Pattern Syntax
A pattern is a sequence of whitespace-separated commands.
A command is a sequence of an operator,
an opening delimiter,
a regular expression,
a closing delimiter,
and zero-or-more flags.
The last command of a pattern need not have a closing delimiter if it is
given no flags.
.Pp
The supported operators are as follows:
.Pp
.Bl -tag -compact
.It g
Keep matches that match the given regex.
.It G
Keep matches that don’t match the given regex.
.It h
Highlight substrings in matches that match the given regex.
.It H
Highlight substrings in matches that don’t match the given regex.
.It x
Select everything that matches the given regex.
.It X
Select everything that doesn’t match the given regex.
.El
.Pp
An example pattern to match all numbers that contain a ‘3’ but aren’t
‘1337’ could be
.Sq x/[0\-9]+/ g/3/ G/^1337$/ .
In that pattern,
.Sq x/[0\-9]+/
selects all numbers in the input,
.Sq g/3/
keeps only those matches that contain the digit 3,
and
.Sq G/^1337$/
filters out the specific number 1337.
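.Pp
The way the three commands in that pattern build on one another can be
modelled with a short Python sketch.
It is an illustration only and not how
.Nm
is implemented;
the helper names
.Ql select ,
.Ql keep ,
and
.Ql drop
are hypothetical,
and the sketch assumes that
.Sq ^
and
.Sq $
in a filter anchor to the ends of each selected match,
as the example above implies.
.Bd -literal -offset indent
```python
import re


def select(regex, texts):
    """x: select every substring of each text that matches regex."""
    return [m.group(0) for t in texts for m in re.finditer(regex, t)]


def keep(regex, matches):
    """g: keep only the matches in which regex is found."""
    return [m for m in matches if re.search(regex, m)]


def drop(regex, matches):
    """G: keep only the matches in which regex is not found."""
    return [m for m in matches if not re.search(regex, m)]


text = "7 13 1337 2300 42 31337"

# Model of the pattern: x/[0-9]+/ g/3/ G/^1337$/
result = drop("^1337$", keep("3", select("[0-9]+", [text])))
print(result)  # prints ['13', '2300', '31337']
```
.Ed
Each step receives only what the previous step let through,
which is why the final filter sees whole numbers rather than whole lines.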
.Pp
The opening and closing delimiters used for each command can be any
valid UTF-8 codepoint.
As a result,
the following pattern using the delimiters
.Sq | ,
.Sq \&. ,
and
.Sq ä
is well-formed:
.Pp
.Dl x|[0\-9]+| g.3. Gä^1337$ä
.Pp
Delimiters also respect the Unicode
.Sq Bidirectional Paired Bracket
property.
This means that alongside the previous examples,
the following non-exhaustive list of character pairs may be used as
opening and closing delimiters:
.Pp
.Bl -bullet -compact
.It
「…」
.It
⟮…⟯
.It
⟨…⟩
.El
.Pp
It is not recommended that you use characters that have a special meaning
in regular expression syntax as delimiters,
unless you’re using literal patterns via the
.Fl l
option or the
.Sq l
command flag.
.Pp
Operators are not allowed to take empty regular expression arguments with
one exception:
.Sq h .
When given an empty regular expression argument,
the
.Sq h
operator assumes the same regular expression as the previous operator.
This avoids duplication in the common case where you want to highlight
text matched by a
.Sq g
or
.Sq x
operator.
The following example pattern selects all words that have a capital
letter,
and highlights the capital letter(s):
.Pp
.Dl x/\ew+/ g/\ep{Lu}/ h//
.Pp
The empty
.Sq h
operator is not permitted as the first operator in a pattern.
.Pp
While various command-line options exist to alter the behaviour of
patterns such as
.Fl i
to enable case-insensitive matching or
.Fl U
to disable Unicode support,
options can also be set for an individual command by appending one or
more flags to that command.
As an example,
one could match all sequences of one-or-more non-whitespace characters
that contain the case-insensitive literal string
.Sq [hi]
by using the following pattern:
.Pp
.Dl x/\eS+/ g/[hi]/li
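.Pp
The difference that the
.Sq l
and
.Sq i
flags make to the
.Sq g
command above can be approximated in Python.
This is a sketch under stated assumptions,
not the implementation used by
.Nm :
.Ql str.split
stands in for
.Sq x/\eS+/ ,
and all variable names are hypothetical.
.Bd -literal -offset indent
```python
import re

text = "say [HI] to him"

# Stand-in for the x command: runs of non-whitespace characters.
words = text.split()

# g/[hi]/li: escape the text so it is matched literally, ignoring case.
literal = re.compile(re.escape("[hi]"), re.IGNORECASE)
as_literal = [w for w in words if literal.search(w)]

# g/[hi]/i without the l flag: [hi] is read as a character class.
char_class = re.compile("[hi]", re.IGNORECASE)
as_regex = [w for w in words if char_class.search(w)]

print(as_literal)  # prints ['[HI]']
print(as_regex)    # prints ['[HI]', 'him']
```
.Ed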
.Pp
The currently supported flags are as follows:
.Pp
.Bl -tag -compact
.It i/I
enable or disable case-insensitive matching respectively
.It l/L
enable or disable treating the supplied regex as a fixed string
.It u/U
enable or disable Unicode support respectively
.El
.Sh ENVIRONMENT
.Bl -tag -width Ds
.It Ev NO_COLOR
Do not display any colored output when set to a non-empty string,
even if the standard-output is a terminal.
This environment variable takes precedence over
.Ev CLICOLOR_FORCE .
.It Ev CLICOLOR_FORCE
Force display of colored output when set to a non-empty string,
even if the standard-output isn’t a terminal.
.It Ev TERM
Disable colored output when set to
.Sq dumb ,
taking precedence over all other environment variables.
.El
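.Pp
Taken together with the
.Fl c
option,
these rules amount to the decision order sketched below in Python.
The function
.Ql use_color
and its parameters are hypothetical and only restate this manual;
the sketch additionally assumes that color is enabled by default exactly
when the standard output is a terminal.
.Bd -literal -offset indent
```python
def use_color(env, is_tty, force_flag=False):
    """Decide whether output is colored, following the documented order."""
    if force_flag:
        # The -c option takes precedence over the environment.
        return True
    if env.get("TERM") == "dumb":
        # TERM=dumb beats every other environment variable.
        return False
    if env.get("NO_COLOR"):
        # A non-empty NO_COLOR beats CLICOLOR_FORCE.
        return False
    if env.get("CLICOLOR_FORCE"):
        # A non-empty CLICOLOR_FORCE colors output even without a terminal.
        return True
    # Assumed default: color only when writing to a terminal.
    return is_tty


both = {"NO_COLOR": "1", "CLICOLOR_FORCE": "1"}
print(use_color(both, is_tty=True))                               # prints False
print(use_color({"TERM": "dumb"}, is_tty=True, force_flag=True))  # prints True
```
.Ed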
.Sh EXIT STATUS
The
.Nm
utility exits with one of the following values:
.Pp
.Bl -tag -width Ds -offset indent -compact
.It Li 0
One or more matches were selected.
.It Li 1
No matches were selected.
.It Li 2
A non-fatal error occurred,
such as failure to read a file.
.It Li >2
A fatal error occurred.
.El
.Sh EXAMPLES
List all your system’s CPU flags, sorted and without duplicates:
.Pp
.Dl $ grab 'x/^flags.*?$/ x/\ew+/ G/^flags$/' </proc/cpuinfo | sort -u
.Pp
Search for a pattern in multiple files without printing filenames or
position information:
.Pp
.Dl $ cat file1 file2 file3 | grab 'x/pattern/'
.Pp
Search the files matching
.Ql *.vue
in the current git repository for usages of an
.Ql <hb\-form\-text>
Vue component,
but only those which are being passed a
.Ql placeholder
property:
.Pp
.Dl $ git grab 'x/<hb\-form\-text.*?>/ g/\ebplaceholder\eb/' '*.vue'
.Pp
Extract bibliographic references from
.Xr mdoc 7
formatted manual pages:
.Pp
.Dl $ grab 'x/(^\e.%.*?\en)+/' foo.1 bar.1
.Sh SEE ALSO
.Xr git 1 ,
.Xr grep 1 ,
.Xr pcre2syntax 3 ,
.Xr regex 7
.Rs
.%A Rob Pike
.%C "Murray Hill, New Jersey 07974"
.%D 1987
.%Q "AT&T Bell Laboratories"
.%T Structural Regular Expressions
.%U https://doc.cat\-v.org/bell_labs/structural_regexps/se.pdf
.Re
.Pp
.Lk https://en.wikipedia.org/wiki/ANSI_escape_code#SGR "SGR Parameters"
.Sh AUTHORS
.An Thomas Voss Aq Mt mail@thomasvoss.com
.Sh NOTES
When pattern matching with literal strings, you should avoid using
delimiters that are contained within the search string, as any backslashes
used to escape the delimiters will be searched for in the text literally.
.Sh BUGS
The pattern string provided as a command-line argument as well as the
provided input files must be encoded as UTF-8.
No other encodings are supported unless they are UTF-8 compatible,
such as ASCII.
.Pp
The
.Fl L
option has incredibly poor performance compared to the
.Fl b
option,
especially with very large inputs.