# Grab — A better grep

Grab is a more powerful version of the well-known Grep utility, making
use of structural regular expressions as described by Rob Pike in [this
paper][1].  Grab allows you to be far more precise with your searching
than Grep, as it doesn’t constrain itself to working only on individual
lines.


## Installation

To install grab, all you need is a C compiler:

```sh
$ cc -o make make.c  # Bootstrap the build script
$ ./make  # Build the project
$ ./make install  # Install the project
```


## Description

Grab invocations must include a pattern string that specifies which text
to match.  A pattern string consists of one or more commands.  A command
is an operator followed by a delimiter, a regular expression (regex), and
a closing occurrence of the same delimiter.  The closing delimiter of the
last command is optional.

For example, a pattern string may look like ‘`x/[a-z]+/ g.foo. v/bar/`’.

The available operators are ‘g’, ‘v’, ‘x’, and ‘y’.  The ‘g’ and ‘v’
operators are filter operators, while ‘x’ and ‘y’ are selection
operators.

You will usually want to begin your pattern with a selection operator.
By default the entire contents of the file you’re searching through are
selected, so a selection operator is how you narrow that down to the text
you care about.  With ‘x’ you specify what text to select in the file.
For example ‘`x/[0-9]+/`’ will select all numbers:

```sh
echo 'foo12bar34baz' | grab 'x/[0-9]+/'
# ⇒ 12
# ⇒ 34
```

The ‘y’ operator works in reverse, selecting everything that _doesn’t_
match the given regex:

```sh
echo 'foo12bar34baz' | grab 'y/[0-9]+/'
# ⇒ foo
# ⇒ bar
# ⇒ baz
```
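
If you already know Grep, the two selection operators can be approximated
with standard tools as long as the input is a single line.  The sketch
below is only an analogy, not a description of how Grab is implemented.
It assumes a `grep` that supports `-o` (GNU and BSD do; POSIX does not
require it), and dropping empty pieces in the `awk` split is an
assumption of the sketch rather than documented Grab behaviour.

```sh
# Roughly ‘x/[0-9]+/’: print every match on its own line.
printf 'foo12bar34baz\n' | grep -oE '[0-9]+'

# Roughly ‘y/[0-9]+/’: split on the regex and print the pieces in between.
printf 'foo12bar34baz\n' |
awk '{ n = split($0, piece, /[0-9]+/)
       for (i = 1; i <= n; i++) if (piece[i] != "") print piece[i] }'

# The limit of the analogy: grep examines one line at a time, so a match
# that would have to span a newline is never found (the count printed is 0).
printf 'f(a,\nb)\n' | grep -cE 'f\([^)]*\)' || true
```

The last command is the case where a line-oriented tool gives up and a
selection that is not bound to individual lines becomes useful.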

You can additionally use filter operators to keep or discard certain
results.  The ‘g’ operator will filter out any results that don’t match
the given regex, while the ‘v’ operator will do the opposite.  To select
all numbers that contain a ‘3’ we can thus do:

```sh
echo 'foo12bar34baz' | grab 'x/[0-9]+/ g/3/'
# ⇒ 34

# If we had used ‘x’ instead of ‘g’, the result would have just been ‘3’.
# Filter operators do not modify the selections; they merely filter them.
```

Likewise, to select all numbers that don’t contain a ‘3’:

```sh
echo 'foo12bar34baz' | grab 'x/[0-9]+/ v/3/'
# ⇒ 12
```

You can also chain these together.  To get all numbers in a file that
contain a ‘3’ but aren’t the specific number ‘1337’, we could do the
following:

```sh
grab 'x/[0-9]+/ g/3/ v/^1337$/' /foo/bar
```
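
To make the role of each stage in that chain concrete, here is a rough
equivalent built from a Grep pipeline.  It is a sketch under assumptions:
the sample data is made up, `grep -o` must be available, and it only
agrees with the Grab pattern because no number spans a line break.

```sh
# Hypothetical sample data, written to a temporary file.
tmp=$(mktemp)
printf 'id 1337 port 8031\nrow 42 col 13\n' > "$tmp"

# grep -o      plays the role of ‘x/[0-9]+/’  (select each number)
# grep 3       plays the role of ‘g/3/’       (keep selections containing a 3)
# grep -vx ... plays the role of ‘v/^1337$/’  (drop the selection that is exactly 1337)
grep -oE '[0-9]+' "$tmp" | grep '3' | grep -vx '1337'
# ⇒ 8031
# ⇒ 13

rm -f "$tmp"
```

Note that the two filter stages pass each selection through whole or not
at all; only the first stage decides what a selection looks like.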


## Examples

### Get a list of your CPU flags

```sh
# With Grep
grep '^flags' /proc/cpuinfo \
| sed 's/^flags\t*: //; y/ /\n/' \
| sort \
| uniq

# With Grab
grab 'x/^flags.*/ x/\w+/ v/flags/' /proc/cpuinfo \
| sort \
| uniq
```

1) Select lines that start with ‘flags’: `x/^flags.*/`
2) Select all the words: `x/\w+/`
3) Filter out the word ‘flags’: `v/flags/`


### Find `<my-tag>` tags with the attribute `data-attr` in a Git repo

```sh
git grab 'x/<my-tag[^>]*>/ g/data-attr/' '*.html'
```

1) Select all tags matching `<my-tag>`
2) Filter out tags without `data-attr`


## Additional Options

The Grab utility has a few options that may be helpful for your use case.
For more detailed documentation, see the Grab manual with `man grab`.


[1]: https://doc.cat-v.org/bell_labs/structural_regexps/se.pdf