.Dd January 18, 2024
.Dt GRAB 1
.Os
.Sh NAME
.Nm grab
.Nd search for patterns in files
.Sh SYNOPSIS
.Nm
.Op Fl cfnUz
.Ar pattern
.Op Ar
.Nm
.Fl h
.Pp
.Nm "git grab"
.Op Fl cnUz
.Ar pattern
.Op Ar glob ...
.Nm "git grab"
.Fl h
.Sh DESCRIPTION
The
.Nm
utility searches files for text matching
.Ar pattern
and prints the matches to the standard output.
Unlike the
.Xr grep 1
utility,
.Nm
is not strictly line-oriented;
rather than always matching complete lines,
it lets the user define the structure of the text to match and then
filter the results.
For more details on the pattern syntax, see
.Sx Pattern Syntax .
.Pp
The
.Nm "git grab"
utility is identical to the
.Nm
utility with two exceptions.
First, if no globs are specified,
input is not read from the standard input;
instead, all files returned by an invocation of
.Xr git\-ls\-files 1
are processed.
If the user provides one or more globs,
only the files returned by
.Xr git\-ls\-files 1
that match one or more of the given globs will be processed.
Secondly, the
.Fl f
option is not available;
its behavior is always assumed and cannot be disabled.
.Pp
.Nm
will read from the files provided on the command-line.
If no files are provided, the standard input will be read instead.
The special filename
.Sq \-
can also be provided,
which represents the standard input.
.Pp
The default behavior of
.Nm
is to print pattern matches to the standard output.
If more than one file argument is provided,
matches will be prefixed by their respective filename and a colon.
Note that this behavior is modified by the
.Fl f
and
.Fl z
options.
.Pp
The options are as follows:
.Bl -tag -width Ds
.It Fl c , Fl Fl color
Force colored output,
even if the output device is not a TTY.
This is useful when piping the output of
.Nm
into a pager such as
.Xr less 1 .
.Pp
Even when this option is specified,
if the
.Ev TERM
environment variable is set to
.Sq dumb ,
no color will be output.
.It Fl f , Fl Fl filenames
Always prefix matches with the names of the files in which the matches
were made,
even if only one file was provided.
.Pp
This option is always enabled when using
.Nm "git grab" .
.It Fl h , Fl Fl help
Display help information by opening this manual page.
.It Fl n , Fl Fl newline
Treat the newline as a special character by disallowing the dot
.Pq Sq \&.
wildcard from matching newlines in regular expressions.
.Pp
The behavior of negated character classes under this option depends on
whether
.Nm
is compiled with PCRE support.
See
.Sx CAVEATS
for more information.
.It Fl U , Fl Fl no\-unicode
Don’t use Unicode properties when matching \ed, \ew, etc.
Recognize only ASCII values instead.
.Pp
If
.Nm
is not compiled with PCRE support,
this option will cause the program to terminate with exit status 2.
.It Fl z , Fl Fl zero
Separate output data by null bytes
.Pq Sq \e0
instead of newlines.
This option can be used to process matches containing newlines.
.Pp
If combined with the
.Fl f
option,
or if two or more files were provided as arguments,
filenames and matches will be separated by null bytes instead of colons.
.El
.Ss Regular Expression Syntax
By default
.Nm
supports Perl-compatible regular expressions
.Pq Sq PCREs .
It is, however, possible to build and install
.Nm
without support for PCREs.
When built without PCRE support,
POSIX extended regular expressions are used instead.
.Pp
You should always assume that PCRE support is available,
but if you would like to be absolutely sure you can check if the program
terminates unsuccessfully when using the
.Fl U
option.
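.Pp
As a minimal sketch of such a check,
assuming a POSIX shell and that no other error condition also yields
exit status 2,
the following runs
.Nm
on empty input with the
.Fl U
option and inspects the exit status;
the pattern
.Sq x/a/
is only a placeholder:
.Bd -literal -offset indent
$ grab \-U 'x/a/' </dev/null >/dev/null 2>&1
$ [ $? \-eq 2 ] && echo 'grab was built without PCRE support'
.Ed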
.Ss Pattern Syntax
A pattern is a sequence of commands optionally separated by whitespace.
A command is an operator followed by a delimiter, a regular expression,
and a closing instance of the same delimiter.
The last command of a pattern need not have a terminating delimiter.
.Pp
The supported operators are as follows:
.Bl -tag -compact
.It g
Keep selections that match the given regex.
.It v
Discard selections that match the given regex.
.It x
Select everything that matches the given regex.
.It y
Select everything that doesn’t match the given regex.
.El
.Pp
An example pattern to match all numbers that contain a ‘3’ but aren’t
‘1337’ could be
.Sq x/[0\-9]+/ g/3/ v/^1337$/ .
In that pattern,
.Sq x/[0\-9]+/
selects all numbers in the input,
.Sq g/3/
keeps only those matches that contain the digit 3,
and
.Sq v/^1337$/
filters out the specific number 1337.
.Pp
As you may use whichever delimiter you like, the following is also valid:
.Pp
.Dl x|[0\-9]+| g.3. v#^1337$#
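.Pp
The
.Sq y
operator is not used in the example above.
As a sketch,
assuming a hypothetical file
.Pa fields.csv
that contains comma-separated fields with no empty fields,
the following should select each line and then each field within it:
.Pp
.Dl $ grab \-n 'x/.+/ y/,/' fields.csv
.Pp
Here
.Sq x/.+/
combined with the
.Fl n
option should select each non-empty line,
and
.Sq y/,/
selects the text between the commas.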
.Sh ENVIRONMENT
.Bl -tag -width GRAB_COLORS
.It Ev GRAB_COLORS
A comma-separated list of color options in the form
.Sq key=val .
The value specified by
.Ar val
must be an SGR parameter.
For more information see
.Sx "SEE ALSO" .
.Pp
The keys are as follows:
.Pp
.Bl -tag -compact
.It fn
filenames prefixing any content line.
.It se
separators inserted between filenames and content lines.
.El
.Pp
The default value is
.Sq fn=35,se=36 .
.It Ev NO_COLOR
Do not display any colored output when set to a non-empty string,
even if the standard output is a terminal.
.It Ev TERM
If set to
.Sq dumb ,
colored output is disabled,
even when the
.Fl c
option is provided.
.El
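.Pp
As an illustration,
the following sketch sets
.Ev GRAB_COLORS
for a single invocation so that filenames are printed in green
.Pq SGR parameter 32
and separators in yellow
.Pq SGR parameter 33 ;
the pattern and the filenames
.Pa file1
and
.Pa file2
are placeholders,
and a
.Xr less 1
that supports the
.Fl R
option is assumed:
.Pp
.Dl $ GRAB_COLORS='fn=32,se=33' grab \-c 'x/pattern/' file1 file2 | less \-R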
.Sh EXIT STATUS
.Ex -std
.Sh EXAMPLES
List all your system's CPU flags, sorted and without duplicates:
.Pp
.Dl $ grab 'x/^flags.*/ x/\ew+/ v/flags/' /proc/cpuinfo | sort | uniq
.Pp
Search for a pattern in multiple files without printing filenames:
.Pp
.Dl $ cat file1 file2 file3 | grab 'x/pattern/'
.Pp
Search for usages of an
.Ql <hb\-form\-text>
Vue component —
but only those which are being passed a
.Ql placeholder
property —
searching all files in the current git-repository:
.Pp
.Dl $ git grab 'x/<hb\-form\-text.+?>/ g/\ebplaceholder\eb/' '*.vue'
.Pp
Extract bibliographic references from
.Xr mdoc 7
formatted manual pages:
.Pp
.Dl $ grab \-n 'x/(^\e.%.*\en)+/' foo.1 bar.1
.Pp
Extract the
.Sx SYNOPSIS
section from the given
.Xr mdoc 7
formatted manual pages:
.Pp
.Dl $ grab \-n 'x/^\e.Sh SYNOPSIS\en(^.*\en(?!^\e.Sh))+/' foo.1 bar.1
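.Pp
Process matches that may contain newlines by separating them with null
bytes;
this sketch assumes an
.Xr xargs 1
that supports the
.Fl 0
option,
and the pattern and
.Pa file1
are placeholders:
.Pp
.Dl $ grab \-z 'x/pattern/' file1 | xargs \-0 \-n 1 printf '<%s>\en'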
.Sh SEE ALSO
.Xr git\-ls\-files 1 ,
.Xr grep 1 ,
.Xr pcre2syntax 3 ,
.Xr regex 7
.Rs
.%A Rob Pike
.%D 1987
.%T Structural Regular Expressions
.%U https://doc.cat\-v.org/bell_labs/structural_regexps/se.pdf
.Re
.Pp
.Lk https://en.wikipedia.org/wiki/ANSI_escape_code#SGR "SGR Parameters"
.Sh AUTHORS
.An Thomas Voss Aq Mt mail@thomasvoss.com
.Sh CAVEATS
The behavior of negated character classes in regular expressions when
given the
.Fl n
option depends on whether PCRE support is available.
.Pp
When PCRE support is available and the
.Fl n
option is provided,
the regular expression
.Ql [^a]
will nonetheless match the newline character.
When PCRE support is not available and the
.Fl n
option is provided,
the newline will
.Em not
be matched by
.Ql [^a] .
.Sh BUGS
When writing pattern matches to the standard output,
.Nm
appends a newline to the end of the match.
This often produces a superfluous blank line,
as matches frequently already end with a newline.
.Pp
Input files must be encoded as UTF-8.
No other encodings are supported unless they are UTF-8 compatible,
such as ASCII.