.Dd January 26, 2024
.Dt GRAB 1
.Os Grab 2.2.1
.Sh NAME
.Nm grab ,
.Nm "git grab"
.Nd search for patterns in files
.Sh SYNOPSIS
.Nm
.Op Fl s | z
.Op Fl bcfinU
.Ar pattern
.Op Ar
.Nm
.Fl h
.Pp
.Nm "git grab"
.Op Fl s | z
.Op Fl bcinU
.Ar pattern
.Op Ar glob ...
.Nm "git grab"
.Fl h
.Sh DESCRIPTION
The
.Nm
utility searches files for text matching
.Ar pattern
and prints the matches to the standard output.
Unlike the
.Xr grep 1
utility,
.Nm
is not strictly line-oriented;
instead of always matching on complete lines,
the user defines the structure of the text they would like to match and
filters on the results.
For more details on the pattern syntax, see
.Sx Pattern Syntax .
.Pp
The
.Nm "git grab"
utility is identical to the
.Nm
utility, with two exceptions.
The first is that if no files
.Pq or, more precisely, globs
are specified,
input is not read from the standard input;
instead, all non-binary files found in the current git-repository are
processed.
If the user provides one or more globs,
only the non-binary files in the current git-repository that match one or
more of the given globs will be processed.
Secondly, the
.Fl f
option is not available;
its behavior is always assumed and cannot be disabled.
.Pp
.Nm
will read from the files provided on the command-line.
If no files are provided, the standard input will be read instead.
The special filename
.Sq \-
can also be provided,
which represents the standard input.
.Pp
The default behavior of
.Nm
is to print pattern matches to the standard output.
If more than one file argument is provided,
matches will be prefixed by their respective filename and a colon.
Note that this behavior is modified by the
.Fl f
and
.Fl z
options.
.Pp
The options are as follows:
.Bl -tag -width Ds
.It Fl b , Fl Fl byte\-offset
Report the positions of pattern matches using the byte offset/position in
the file instead of the line and column.
.Pp
This option is useful if your text editor
.Pq such as Xr vim 1 or Xr emacs 1
supports jumping directly to a given byte offset/position.
.It Fl c , Fl Fl color
Force colored output,
even if the output device is not a TTY.
This is useful when piping the output of
.Nm
into a pager such as
.Xr less 1 .
.Pp
Even when this option is specified,
if the
.Ev TERM
environment variable is set to
.Sq dumb ,
no color will be output.
.It Fl f , Fl Fl filenames
Always prefix matches with the names of the files in which the matches
were made,
even if only one file was provided.
.Pp
This option is always enabled when using
.Nm "git grab" .
.It Fl h , Fl Fl help
Display help information by opening this manual page.
.It Fl i , Fl Fl ignore\-case
Match patterns case-insensitively.
When PCRE support is available this option respects Unicode
.Po
e.g. the pattern
.Sq x/ß/
will match
.Sq ẞ
.Pc .
.It Fl n , Fl Fl newline
Treat the newline as a special character by disallowing the dot
.Pq Sq \&.
wildcard from matching newlines in regular expressions.
.Pp
This option may behave strangely when
.Nm
is not compiled with PCRE support.
See
.Sx CAVEATS
for more information.
.It Fl s , Fl Fl strip\-newline
Don’t print a newline at the end of a match if the match already ends in
a newline.
This can make output seem more
.Sq natural ,
as many matches will already have terminating newlines.
.Pp
This option is mutually exclusive with the
.Fl z
option.
.It Fl U , Fl Fl no\-unicode
Don’t use Unicode properties when matching \ed, \ew, etc.
Recognize only ASCII values instead.
.Pp
If
.Nm
is not compiled with PCRE support this option will cause the program to
terminate with exit status 2.
.It Fl z , Fl Fl zero
Separate output data by null bytes
.Pq Sq \e0
instead of newlines.
This option can be used to process matches containing newlines.
.Pp
If combined with the
.Fl f
option,
or if two or more files were provided as arguments,
filenames and matches will be separated by null bytes instead of colons.
.Pp
This option is mutually exclusive with the
.Fl s
option.
.El
.Ss Regular Expression Syntax
By default
.Nm
supports Perl-compatible regular expressions
.Pq Sq PCREs .
However, it is possible to build and install
.Nm
without support for PCREs.
When built without PCRE support,
POSIX extended regular expressions are used instead.
.Pp
You should always assume that PCRE support is available,
but if you would like to be absolutely sure,
you can check whether the program terminates unsuccessfully when using the
.Fl U
option.
.Ss Pattern Syntax
A pattern is a sequence of commands optionally separated by whitespace.
A command is an operator followed by a delimiter, a regular expression,
and then terminated by the same delimiter.
The last command of a pattern need not have a terminating delimiter.
.Pp
The supported operators are as follows:
.Pp
.Bl -tag -compact
.It g
Keep everything that matches the given regex.
.It G
Keep everything that doesn’t match the given regex.
.It h
Highlight everything that matches the given regex.
.It H
Highlight everything that doesn’t match the given regex.
.It x
Select everything that matches the given regex.
.It X
Select everything that doesn’t match the given regex.
.El
.Pp
An example pattern to match all numbers that contain a ‘3’ but aren’t
‘1337’ could be
.Sq x/[0\-9]+/ g/3/ G/^1337$/ .
In that pattern,
.Sq x/[0\-9]+/
selects all numbers in the input,
.Sq g/3/
keeps only those matches that contain the digit 3,
and
.Sq G/^1337$/
filters out the specific number 1337.
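.Pp
As a rough illustration of how such a chain of operators behaves,
the following Python sketch mimics the
.Sq x ,
.Sq g ,
and
.Sq G
operators on a small sample string.
It is not how
.Nm
is implemented;
the function names and the sample text are hypothetical,
and Python’s
.Sq re
module stands in for PCRE.

```python
import re


def select(regex, chunks):
    """Like 'x': replace each chunk with every substring of it matching regex."""
    return [m.group(0) for chunk in chunks for m in re.finditer(regex, chunk)]


def keep(regex, chunks):
    """Like 'g': keep only the chunks in which regex matches somewhere."""
    return [chunk for chunk in chunks if re.search(regex, chunk)]


def discard(regex, chunks):
    """Like 'G': keep only the chunks in which regex matches nowhere."""
    return [chunk for chunk in chunks if not re.search(regex, chunk)]


# Hypothetical input; the whole text starts out as a single chunk.
text = "12 1337 345 73 9 1333"
chunks = [text]

# Equivalent in spirit to the pattern: x/[0-9]+/ g/3/ G/^1337$/
chunks = select(r"[0-9]+", chunks)   # every run of digits
chunks = keep(r"3", chunks)          # only those containing a 3
chunks = discard(r"^1337$", chunks)  # drop the chunk that is exactly 1337

print(chunks)  # → ['345', '73', '1333']
```

Note that the anchors in the last step apply to each selected chunk rather than to the whole input,
which is why only the exact chunk 1337 is dropped while 1333 survives.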
.Pp
The delimiter used for each given operator can be any valid UTF-8 encoded
codepoint.
As a result,
the following pattern using the delimiters
.Sq | ,
.Sq \&. ,
and
.Sq ä
is well-formed:
.Pp
.Dl x|[0\-9]+| g.3. Gä^1337ä
.Pp
Operators are not allowed to take empty regular expression arguments with
one exception:
.Sq h .
When given an empty regular expression argument,
the
.Sq h
operator assumes the same regular expression as the previous operator.
This allows you to avoid duplication in the common case where you
wish to highlight text matched by a
.Sq g
operator.
The following example pattern selects all words that have a capital
letter,
and highlights the capital letter(s):
.Pp
.Dl x/\ew+/ g/[A\-Z]/ h//
.Pp
The empty
.Sq h
operator is not permitted as the first operator in a pattern.
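.Pp
The rules in this section can be summarized with a small parser.
The Python sketch below is only an approximation written for this
explanation, not the parser used by
.Nm ;
the function name
.Sq parse_pattern
is hypothetical,
and it assumes that a delimiter cannot be escaped inside a regular
expression,
which this manual does not specify.

```python
OPERATORS = "gGhHxX"


def parse_pattern(pattern):
    """Split a pattern string into a list of (operator, regex) pairs."""
    commands = []
    i = 0
    while i < len(pattern):
        # Commands may optionally be separated by whitespace.
        if pattern[i].isspace():
            i += 1
            continue

        op = pattern[i]
        if op not in OPERATORS:
            raise ValueError(f"unknown operator {op!r}")
        if i + 1 >= len(pattern):
            raise ValueError(f"operator {op!r} is missing a delimiter")

        # The character after the operator is the delimiter; it may be
        # any single codepoint, not just '/'.
        delim = pattern[i + 1]
        end = pattern.find(delim, i + 2)
        if end == -1:
            # The last command need not have a terminating delimiter.
            end = len(pattern)
        regex = pattern[i + 2:end]

        if not regex:
            # Only 'h' may be empty, and then only after another command,
            # whose regular expression it reuses.
            if op != "h":
                raise ValueError(f"operator {op!r} needs a regular expression")
            if not commands:
                raise ValueError("empty 'h' cannot be the first operator")
            regex = commands[-1][1]

        commands.append((op, regex))
        i = end + 1
    return commands


print(parse_pattern("x|[0-9]+| g.3. Gä^1337ä"))
print(parse_pattern(r"x/\w+/ g/[A-Z]/ h//"))
```

The first call yields the commands x, g, and G with the regular expressions
[0-9]+, 3, and ^1337;
in the second, the empty h command inherits [A-Z] from the g command before it.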
.Sh ENVIRONMENT
.Bl -tag -width GRAB_COLORS
.It Ev GRAB_COLORS
A comma-separated list of color options in the form
.Sq key=val .
The value specified by
.Ar val
must be an SGR parameter.
For more information see
.Sx "SEE ALSO" .
.Pp
The keys are as follows:
.Pp
.Bl -tag -compact
.It fn
filenames prefixing any content line.
.It hl
text matched by an
.Sq h
or
.Sq H
command.
.It ln
line- and column-numbers,
as well as byte offsets when reporting the location of a match.
.It se
separators inserted between filenames and content lines.
.El
.Pp
The default value is
.Sq fn=35,hl=01;31,ln=32,se=36 .
.It Ev NO_COLOR
Do not display any colored output when set to a non-empty string,
even if the standard output is a terminal.
.It Ev TERM
If set to
.Sq dumb ,
colored output is disabled,
even when the
.Fl c
option is provided.
.El
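.Pp
To make the
.Ev GRAB_COLORS
format concrete,
the following Python sketch parses such a value and wraps text in the
resulting SGR escape sequences.
It is an illustration only:
the function names are hypothetical,
and the handling of missing or unknown keys
.Pq fall back to the default, ignore the entry
is an assumption rather than documented behavior of
.Nm .

```python
import os

# Default value as given in this manual.
DEFAULT_COLORS = "fn=35,hl=01;31,ln=32,se=36"


def parse_grab_colors(value):
    """Parse a comma-separated list of key=val pairs into a dict.

    Assumption: keys that are not set keep their default value, and
    unknown or malformed entries are silently ignored.
    """
    colors = dict(item.split("=", 1) for item in DEFAULT_COLORS.split(","))
    for item in value.split(","):
        key, sep, sgr = item.partition("=")
        if sep and key in colors:
            colors[key] = sgr
    return colors


def colorize(text, sgr):
    """Wrap text in an SGR sequence, followed by an SGR reset."""
    return f"\x1b[{sgr}m{text}\x1b[0m"


colors = parse_grab_colors(os.environ.get("GRAB_COLORS", ""))
# repr() shows the escape sequences instead of sending them to the terminal.
print(repr(colorize("file.c", colors["fn"])))
```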
.Sh EXIT STATUS
.Ex -std
.Sh EXAMPLES
List all of your system’s CPU flags on Linux, sorted and without duplicates:
.Pp
.Dl $ grab 'x/^flags.*/ x/\ew+/ G/flags/' /proc/cpuinfo | sort | uniq
.Pp
Search for a pattern in multiple files without printing filenames:
.Pp
.Dl $ cat file1 file2 file3 | grab 'x/pattern/'
.Pp
Search all Vue files in the current git-repository for usages of the
.Ql <hb\-form\-text>
component,
but only those which are being passed a
.Ql placeholder
property:
.Pp
.Dl $ git grab 'x/<hb\-form\-text.*?>/ g/\ebplaceholder\eb/' '*.vue'
.Pp
Extract bibliographic references from
.Xr mdoc 7
formatted manual pages:
.Pp
.Dl $ grab \-n 'x/(^\e.%.*\en)+/' foo.1 bar.1
.Pp
Extract the
.Sx SYNOPSIS
section from the given
.Xr mdoc 7
formatted manual pages:
.Pp
.Dl $ grab \-n 'x/^\e.Sh SYNOPSIS\en(^.*\en(?!^\e.Sh))+/' foo.1 bar.1
.Sh SEE ALSO
.Xr git 1 ,
.Xr grep 1 ,
.Xr pcre2syntax 3 ,
.Xr regex 7
.Rs
.%A Rob Pike
.%C "Murray Hill, New Jersey 07974"
.%D 1987
.%Q "AT&T Bell Laboratories"
.%T Structural Regular Expressions
.%U https://doc.cat\-v.org/bell_labs/structural_regexps/se.pdf
.Re
.Pp
.Lk https://en.wikipedia.org/wiki/ANSI_escape_code#SGR "SGR Parameters"
.Sh AUTHORS
.An Thomas Voss Aq Mt mail@thomasvoss.com
.Sh CAVEATS
The behavior of negated character classes in regular expressions will
vary when given the
.Fl n
option depending on whether PCRE support is available.
.Pp
When PCRE support is available and the
.Fl n
option is provided,
the regular expression
.Ql [^a]
will nonetheless match the newline character.
When PCRE support is not available and the
.Fl n
option is provided,
the newline will
.Em not
be matched by
.Ql [^a] .
.Sh BUGS
The pattern string provided as a command-line argument as well as the
provided input files must be encoded as UTF-8.
No other encodings are supported unless they are UTF-8 compatible,
such as ASCII.